Preliminary approach on synthetic data sets generation based on class separability measure

Núria Macià, Ester Bernadó-Mansilla, Albert Orriols-Puig

    Research output: Book chapter › Conference contribution › Peer-reviewed

    20 Citations (Scopus)


    Classifier performance is usually evaluated on real-world problems, most of which come from public repositories. However, the inherent properties of these data, and how they affect classifier behavior, are unknown. In addition, the high cost or difficulty of experiments hinders data collection, leading to complex data sets characterized by few instances, missing values, and imprecise data. Generating synthetic data sets addresses both issues: it allows us to build problems at low cost and with predefined characteristics, which is useful for testing system limitations in a controlled framework. This paper proposes generating synthetic data sets based on data complexity. We rely on the length of the class boundary to build the data sets, obtaining a preliminary set of benchmarks to assess classifier accuracy. The study can be developed further to identify regions of competence for classifiers.
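The abstract does not spell out how the length of the class boundary is computed. As a rough illustration only, the sketch below assumes one common formulation: build a minimum spanning tree over all points and report the fraction of points joined by a tree edge to a point of another class. The function names `boundary_fraction` and `two_class_sample`, the Gaussian blob generator, and the `gap` parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def boundary_fraction(points, labels):
    """Fraction of points joined to an opposite-class point by an MST edge.

    Assumed formulation of the class-boundary measure, not the paper's code.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known edge from the tree to each point
    best_from = [-1] * n        # tree endpoint of that cheapest edge
    best_dist[0] = 0.0
    on_boundary = set()
    for _ in range(n):
        # Prim's algorithm: add the closest point that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        v = best_from[u]
        if v >= 0 and labels[u] != labels[v]:
            on_boundary.update((u, v))  # this tree edge crosses the boundary
        for w in range(n):
            if not in_tree[w]:
                d = math.dist(points[u], points[w])
                if d < best_dist[w]:
                    best_dist[w] = d
                    best_from[w] = u
    return len(on_boundary) / n


def two_class_sample(n_per_class, gap, seed=0):
    """Hypothetical generator: two unit-variance 2-D Gaussian blobs whose
    centers lie `gap` apart along the x axis."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, center_x in ((0, 0.0), (1, gap)):
        for _ in range(n_per_class):
            points.append((rng.gauss(center_x, 1.0), rng.gauss(0.0, 1.0)))
            labels.append(label)
    return points, labels
```

Under these assumptions, shrinking `gap` makes the blobs overlap and should push the measure up, while a large `gap` leaves a single cross-class tree edge. Sweeping `gap` and recording `boundary_fraction(*two_class_sample(100, gap))` is one simple way to obtain data sets that cover a range of boundary lengths.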

    Original language: English
    Title of host publication: 2008 19th International Conference on Pattern Recognition, ICPR 2008
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Print): 9781424421756
    Publication status: Published - 2008

    Publication series

    Name: Proceedings - International Conference on Pattern Recognition
    ISSN (Print): 1051-4651

