Abstract
This paper describes our participation in the author profiling task of the 9th PAN evaluation lab. The proposed approach relies on the extraction of stylistic, lexicon-based, and corpus-based features, which were combined using a logistic classifier. The three feature sets overlap: some features belong to two sets, and a few belong to all three. A comprehensive comparison of the contribution of several feature subsets is presented. In particular, a set of features based on Bayesian inference provided the most important contribution. We developed our system on the Spanish training corpus and then applied it, with minor changes, to the English documents. The proposed system ranked 6th among 17 submitted systems in the official ranking for Spanish documents. This result shows that our approach is meaningful and competitive for predicting demographics from text.
| Original language | English |
|---|---|
| Number of pages | 9 |
| Journal | CEUR Workshop Proceedings |
| Volume | 1179 |
| Publication status | Published - Sept. 2013 |
| Published externally | Yes |
| Event | 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain. Duration: 23 Sept. 2013 → 26 Sept. 2013 |
| Title | Author profiling using corpus statistics, lexicons and stylistic features: Notebook for PAN at CLEF-2013 |