Parallelizing record linkage for disclosure risk assessment

Joan Guisado-Gámez, Arnau Prat-Pérez, J. Nin, Victor Muntés-Mulero, Josep Ll Larriba-Pey

Research output: Book chapter, Conference contribution, peer-review

4 Citations (Scopus)

Abstract

Handling very large volumes of confidential data is becoming common practice in many organizations, such as statistical agencies. This calls for protection methods that have to be validated in terms of the quality they provide. Record Linkage (RL) makes it possible to compute the disclosure risk, which gives a measure of the quality of a data protection method. However, the RL methods proposed in the literature are computationally costly, which poses difficulties when frequent RL processes have to be executed on large data sets. Here, we propose a distributed computing technique to improve the performance of an RL process. We show that our technique not only significantly reduces the computing time of an RL process, but is also scalable in a distributed environment. We also show that distributed computation can be complemented with SMP-based parallelization in each node, increasing the final speedup.
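
As an illustration of the idea summarized in the abstract, the following is a minimal sketch of distance-based record linkage used as a disclosure risk estimate, with the work split into independent chunks. It is not the authors' implementation: the function names (`link_chunk`, `disclosure_risk`), the Euclidean distance, the assumption that record `i` of the masked file corresponds to record `i` of the original file, and the use of a thread pool as a stand-in for distributed nodes are all assumptions made for this example. In CPython the threads mainly demonstrate the partitioning; a real speedup would need processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def link_chunk(original, masked, start, stop):
    """Count masked records in [start, stop) that link back to their true original.

    A masked record i counts as re-identified when its nearest original
    record (Euclidean distance) is the one with the same index i.
    """
    hits = 0
    for i in range(start, stop):
        nearest = min(range(len(original)),
                      key=lambda j: dist(masked[i], original[j]))
        if nearest == i:
            hits += 1
    return hits


def disclosure_risk(original, masked, workers=4):
    """Fraction of masked records correctly linked, computed chunk by chunk."""
    n = len(masked)
    if n == 0:
        return 0.0
    step = -(-n // workers)  # ceiling division: records per chunk
    bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
    # Each chunk is independent, so chunks could equally be sent to
    # different nodes; here a thread pool plays that role (assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda b: link_chunk(original, masked, *b), bounds)
        return sum(counts) / n


# Hypothetical toy data: the last two masked records are swapped,
# so only the first two link back to the correct original record.
original = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0), (5.0, 30.0)]
masked = [(0.5, -0.2), (9.0, 11.0), (6.0, 29.0), (19.0, 1.0)]
print(disclosure_risk(original, masked, workers=2))  # 0.5
```

Because every masked record is compared against the whole original file, the cost grows with the product of the two file sizes, which is why partitioning the masked records across workers is a natural way to parallelize this kind of linkage.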

Original language: English
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy International Conference, PSD 2008, Proceedings
Publisher: Springer Verlag
Pages: 190-202
Number of pages: 13
ISBN (Print): 3540874704, 9783540874706
Publication status: Published - 2008
Externally published: Yes
Event: International Conference on Privacy in Statistical Databases, PSD 2008 - Istanbul, Turkey
Duration: 24 Sept 2008 to 26 Sept 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5262 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Privacy in Statistical Databases, PSD 2008
Country/Territory: Turkey
City: Istanbul
Period: 24/09/08 to 26/09/08

Keywords

  • Disclosure risk evaluation
  • Distributed computing
  • Parallel computing
  • Record linkage
