Learner excellence biased by data set selection: A case for data characterisation and artificial data sets

  • Núria Macià*
  • Ester Bernadó-Mansilla
  • Albert Orriols-Puig
  • Tin Kam Ho

  *Corresponding author for this work

    Research output: Indexed journal article (Article, peer-reviewed)

    37 Citations (Scopus)

    Abstract

    The excellence of a given learner is usually claimed through a performance comparison with other learners over a collection of data sets. Too often, researchers are not aware of the impact of their data selection on the results. Their test beds are small, and the selection of the data sets is not supported by any previous data analysis. Conclusions drawn on such test beds cannot be generalised, because particular data characteristics may favour certain learners unnoticeably. This work raises these issues and proposes the characterisation of data sets using complexity measures, which can be helpful for both guiding experimental design and explaining the behaviour of learners.

    Original language: English
    Pages (from-to): 1054-1066
    Number of pages: 13
    Journal: Pattern Recognition
    Volume: 46
    Issue number: 3
    Publication status: Published, Mar 2013

    Keywords

    • Data complexity
    • Learner assessment
    • Supervised learning
