Feature diversity in cluster ensembles for robust document clustering

Xavier Sevillano*, Germán Cobo, Francesc Alías, Joan Claudi Socoró

*Corresponding author for this work

Research output: Book chapter, Conference contribution, peer-review

13 Citations (Scopus)

Abstract

The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand, but also may vary from one clustering problem to another. As a first step towards building robust document clusterers, a strategy based on feature diversity and cluster ensembles is presented in this work. Experiments conducted on a binary clustering problem show that our method is robust to near-optimal model order selection and able to detect constructive interactions between different document representations in the test bed.

Original language: English
Title of host publication: Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery (ACM)
Pages: 697-698
Number of pages: 2
ISBN (Print): 1595933697, 9781595933690
Publication status: Published - 2006
Event: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Seattle, WA, United States
Duration: 6 Aug 2006 to 11 Aug 2006

Publication series

Name: Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Volume: 2006

Conference

Conference: 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Country/Territory: United States
City: Seattle, WA
Period: 6/08/06 to 11/08/06

Keywords

  • Cluster ensembles
  • Document clustering
  • Feature extraction
