TY - GEN
T1 - Beyond multivariate microaggregation for large record anonymization
AU - Nin, J.
N1 - Funding Information:
This work is partially supported by the Ministry of Science and Technology of Spain under contract TIN2012-34557 and by the BSC-CNS Severo Ochoa program (SEV-2011-00067).
Publisher Copyright:
© Springer International Publishing Switzerland 2014
PY - 2014
Y1 - 2014
AB - Microaggregation is one of the most commonly employed microdata protection methods. The basic idea of microaggregation is to anonymize data by aggregating original records into small groups of at least k elements, thereby preserving k-anonymity. Usually, when records are large, i.e., when the data set has a large number of attributes, the data set is split into smaller blocks of attributes to limit information loss, and microaggregation is applied to each block successively and independently. This is called multivariate microaggregation. This technique reduces the information loss incurred when several values are collapsed to the centroid of their group. Unfortunately, with multivariate microaggregation, the k-anonymity property is lost when the intruder knows at least two attributes from different blocks, which might be the usual case. In this work, we present a new microaggregation method called one dimension microaggregation (Mic1D-k). With Mic1D-k, the problem of k-anonymity loss is mitigated by mixing all the values in the original microdata file into a single non-attributed data set using a set of simple pre-processing steps and then microaggregating all the mixed values together. Our experiments on real data show that our proposal obtains lower disclosure risk than previous approaches while the information loss is maintained.
KW - K-anonymity
KW - Microaggregation
KW - Privacy in statistical databases
UR - http://www.scopus.com/inward/record.url?scp=84927652398&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-04178-0_8
DO - 10.1007/978-3-319-04178-0_8
M3 - Conference contribution
AN - SCOPUS:84927652398
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 87
EP - 107
BT - Citizen in Sensor Networks - 2nd International Workshop, CitiSens 2013, Revised Selected Papers
A2 - Nin, Jordi
A2 - Villatoro, Daniel
PB - Springer Verlag
T2 - 2nd International Workshop on Citizen in Sensor Networks, CitiSens 2013
Y2 - 19 September 2013 through 19 September 2013
ER -