
Semantic blocking for Record Linkage

  • J. Nin
  • Víctor Muntés-Mulero
  • Norbert Martínez-Bazan
  • Josep L. Larriba-Pey

Research output: Book chapter / Conference contribution / peer-review

Abstract

Record Linkage (RL) is an important component of data cleaning and integration and data processing in general. For years, many efforts have focused on improving the performance of the RL process, either by reducing the number of record comparisons or reducing the number of attribute comparisons, which reduces the computational time, but increases the amount of error. However, the real bottleneck of RL is the post-process, where the results have to be reviewed by experts that decide which pairs or groups of records are real links and which are false hits. In this paper we show that exploiting the semantic relationships (e.g. foreign key), established between one or more data sources, makes it possible to find a new sort of semantic blocking method that improves the number of hits and reduces the amount of review effort.
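As a rough illustration of the idea of blocking mentioned in the abstract, the sketch below compares only records that share a block key, and contrasts a block key derived from an attribute value with one derived from a foreign key. It is not the method of the paper: the `employees` data, the `dept_id` foreign key, and the `candidate_pairs` helper are all hypothetical and chosen only to show how blocking cuts the number of record comparisons.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, block_key):
    """Group records by block_key and return the id pairs found inside each block."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])
    pairs = set()
    for ids in blocks.values():
        # Only records in the same block are ever compared.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical toy data: dept_id stands in for a foreign key to another table.
employees = [
    {"id": 1, "name": "J. Smith", "dept_id": 10},
    {"id": 2, "name": "John Smith", "dept_id": 10},
    {"id": 3, "name": "J. Smyth", "dept_id": 11},
    {"id": 4, "name": "Ann Lee", "dept_id": 11},
]

# No blocking: every record is compared with every other record.
all_pairs = candidate_pairs(employees, lambda r: 0)

# Attribute-based blocking (assumption: first letter of the name).
by_initial = candidate_pairs(employees, lambda r: r["name"][0])

# Relationship-based blocking (assumption: shared foreign key value).
by_foreign_key = candidate_pairs(employees, lambda r: r["dept_id"])

print(len(all_pairs), len(by_initial), len(by_foreign_key))
```

Each blocking choice trades fewer comparisons against the risk of missing true links that fall in different blocks, which is the tension the abstract describes.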

Original language: English
Title of host publication: Artificial Intelligence Research and Development
Pages: 141-149
Number of pages: 9
Publication status: Published - 2007
Externally published: Yes
Event: 10th International Conference of the Catalan Association for Artificial Intelligence, CCIA 2007 - Sant Julia de Loria, Andorra
Duration: 25 Oct 2007 to 26 Oct 2007

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 163
ISSN (Print): 0922-6389

Conference

Conference: 10th International Conference of the Catalan Association for Artificial Intelligence, CCIA 2007
Country/Territory: Andorra
City: Sant Julia de Loria
Period: 25/10/07 to 26/10/07

Keywords

  • Blocking algorithms
  • Data cleansing
  • Data integration
  • Data processing
  • Record Linkage
  • Semantic information
