Using genetic algorithms for attribute grouping in multivariate microaggregation

Jordi Balasch Masoliver, Victor Muntés Mulero, J. Nin

Research output: Article in an indexed journal (peer-reviewed)

Abstract

Anonymization techniques that provide k-anonymity lose data quality when dimensionality is high, and microaggregation techniques are no exception. A common workaround is to group the attributes of a record set into non-intersecting subsets and microaggregate each subset independently. While this improves quality by reducing information loss, it usually forfeits the k-anonymity property and thereby increases entity disclosure risk. Despite this, grouping attributes remains common practice for data sets containing a large number of records. Both information loss and disclosure risk vary with the attributes chosen and the correlation among them. However, there have been no serious attempts to propose a method for finding the best attribute grouping. In this paper, we present GOMM, the Genetic Optimizer for Multivariate Microaggregation, which, as far as we know, is the first proposal to use evolutionary algorithms for this problem. The goal of GOMM is to find the optimal, or near-optimal, attribute grouping, taking into account both information loss and disclosure risk. We propose a way to map attribute subsets into a chromosome, together with a set of new mutation operations for this context. We also provide a comprehensive analysis of the proposed operations and show that, on several real data sets, our evolutionary approach yields better-quality anonymized data than previously used ad hoc attribute grouping techniques. Additionally, we provide an improved version of GOMM, called D-GOMM, in which operations are dynamically executed during the optimization process to reduce execution time.
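
To make the idea of mapping attribute subsets into a chromosome more concrete, the following is a minimal illustrative sketch and not the encoding or the operators defined in the paper. It assumes a chromosome is a list with one group label per attribute, which guarantees non-intersecting subsets by construction. The names `decode` and `mutate_move`, the label encoding, and the "move one attribute" mutation are all hypothetical choices made for this example.

```python
import random


def decode(chromosome):
    """Turn a list of group labels (one per attribute) into attribute subsets.

    Assumed encoding (not taken from the paper): chromosome[i] is the label
    of the group that attribute i belongs to.
    """
    groups = {}
    for attr_index, label in enumerate(chromosome):
        groups.setdefault(label, []).append(attr_index)
    # Return the subsets ordered by group label; empty groups are omitted.
    return [groups[label] for label in sorted(groups)]


def mutate_move(chromosome, n_groups, rng):
    """Hypothetical mutation: reassign one random attribute to another group.

    Requires n_groups >= 2 so that a different group always exists.
    """
    child = list(chromosome)
    pos = rng.randrange(len(child))
    choices = [g for g in range(n_groups) if g != child[pos]]
    child[pos] = rng.choice(choices)
    return child


rng = random.Random(0)            # fixed seed so the run is repeatable
parent = [0, 0, 1, 1, 2, 2]       # six attributes split into three groups
child = mutate_move(parent, n_groups=3, rng=rng)

print(decode(parent))
print(decode(child))
```

In a full genetic algorithm, each decoded grouping would be scored by a fitness function that combines information loss and disclosure risk after microaggregating each subset; that scoring step is omitted here because the abstract does not specify it.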
Original language: English
Pages (from-to): 819-836
Journal: Intelligent Data Analysis
Volume: 18
Publication status: Published - 1 Jul 2014