TY - GEN
T1 - Using classification methods to evaluate attribute disclosure risk
AU - Nin, Jordi
AU - Herranz, Javier
AU - Torra, Vicenç
N1 - Funding Information:
Partial support by the Spanish MEC (projects ARES – CONSOLIDER INGENIO 2010 CSD2007-00004 – and eAEGIS – TSI2007-65406-C03-02) is acknowledged. Javier Herranz holds a Ramón y Cajal grant from the Spanish MICINN, partially funded by the European Social Fund (ESF). Jordi Nin is partially supported by the European Community through the 7th Framework Programme Marie Curie Intra-European fellowship, contract No. 235226.
PY - 2010
Y1 - 2010
N2 - Statistical Disclosure Control protection methods perturb the non-confidential attributes of an original dataset and publish the perturbed results along with the values of the confidential attributes. Traditionally, such a method is considered to achieve a good privacy level if attackers who try to link an original record with its perturbed counterpart have a low success probability. A different view has recently been gaining ground: protection methods should resist not only record re-identification attacks, but also attacks that try to guess the true value of a confidential attribute of one or more original records. The risk of such attacks is known as attribute disclosure risk. In this paper we propose a simple strategy to estimate the attribute disclosure risk of a protection method: using a classifier, built from the protected (public) dataset, to predict the attribute values of an original record. After defining this approach in detail, we describe experiments that show its power and danger: very popular protection methods suffer from very high attribute disclosure risk.
KW - Attribute disclosure control
KW - Classification
KW - Privacy-preserving data perturbation
UR - http://www.scopus.com/inward/record.url?scp=79956288030&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-16292-3_27
DO - 10.1007/978-3-642-16292-3_27
M3 - Conference contribution
AN - SCOPUS:79956288030
SN - 3642162916
SN - 9783642162916
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 277
EP - 286
BT - Modeling Decisions for Artificial Intelligence - 7th International Conference, MDAI 2010, Proceedings
T2 - 7th International Conference on Modeling Decisions for Artificial Intelligence, MDAI 2010
Y2 - 27 October 2010 through 29 October 2010
ER -