TY - GEN
T1 - Preliminary approach on synthetic data sets generation based on class separability measure
AU - Macià, Núria
AU - Bernadó-Mansilla, Ester
AU - Orriols-Puig, Albert
PY - 2008
Y1 - 2008
N2 - Usually, performance of classifiers is evaluated on real-world problems that mainly belong to public repositories. However, we ignore the inherent properties of these data and how they affect classifier behavior. Also, the high cost or the difficulty of experiments hinder the data collection, leading to complex data sets characterized by few instances, missing values, and imprecise data. The generation of synthetic data sets solves both issues and allows us to build problems with a minor cost and whose characteristics are predefined. This is useful to test system limitations in a controlled framework. This paper proposes to generate synthetic data sets based on data complexity. We rely on the length of the class boundary to build the data sets, obtaining a preliminary set of benchmarks to assess classifier accuracy. The study can be further matured to identify regions of competence for classifiers.
AB - Usually, performance of classifiers is evaluated on real-world problems that mainly belong to public repositories. However, we ignore the inherent properties of these data and how they affect classifier behavior. Also, the high cost or the difficulty of experiments hinder the data collection, leading to complex data sets characterized by few instances, missing values, and imprecise data. The generation of synthetic data sets solves both issues and allows us to build problems with a minor cost and whose characteristics are predefined. This is useful to test system limitations in a controlled framework. This paper proposes to generate synthetic data sets based on data complexity. We rely on the length of the class boundary to build the data sets, obtaining a preliminary set of benchmarks to assess classifier accuracy. The study can be further matured to identify regions of competence for classifiers.
UR - http://www.scopus.com/inward/record.url?scp=77958104698&partnerID=8YFLogxK
U2 - 10.1109/icpr.2008.4761770
DO - 10.1109/icpr.2008.4761770
M3 - Conference contribution
AN - SCOPUS:77958104698
SN - 9781424421756
T3 - Proceedings - International Conference on Pattern Recognition
BT - 2008 19th International Conference on Pattern Recognition, ICPR 2008
PB - Institute of Electrical and Electronics Engineers Inc.
ER -