Semantic blocking for Record Linkage

Jordi Nin, Víctor Muntés-Mulero, Norbert Martínez-Bazan, Josep L. Larriba-Pey

Research output: Chapter in book; Conference contribution; Peer-reviewed

Abstract

Record Linkage (RL) is an important component of data cleaning and integration, and of data processing in general. For years, many efforts have focused on improving the performance of the RL process, either by reducing the number of record comparisons or by reducing the number of attribute comparisons, which reduces the computational time but increases the amount of error. However, the real bottleneck of RL is the post-process, where the results have to be reviewed by experts who decide which pairs or groups of records are real links and which are false hits. In this paper we show that exploiting the semantic relationships (e.g. foreign keys) established between one or more data sources makes it possible to find a new sort of semantic blocking method that improves the number of hits and reduces the amount of review effort.
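
As a rough illustration of the general idea of blocking on a relationship rather than comparing every record with every other, the following minimal sketch groups records by a shared foreign key and only generates candidate pairs inside each group. It is not the method of the paper: the toy `people` data, the `address_id` foreign key, and the helper names `build_blocks` and `candidate_pairs` are all assumptions made up for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each person record references an address row
# through a foreign key (address_id). None of this comes from the paper.
people = {
    1: {"name": "Joan Garcia", "address_id": 10},
    2: {"name": "J. Garcia", "address_id": 10},
    3: {"name": "Joan Garcia", "address_id": 20},
    4: {"name": "Maria Soler", "address_id": 20},
}


def build_blocks(records, block_key):
    """Group record ids by the value returned by block_key(record)."""
    blocks = defaultdict(list)
    for record_id, record in records.items():
        blocks[block_key(record)].append(record_id)
    return dict(blocks)


def candidate_pairs(blocks):
    """Return the set of id pairs that share a block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Without blocking, every record is compared with every other record.
all_pairs = set(combinations(sorted(people), 2))

# Assumed relationship-based blocking: only records that point at the
# same address row (same foreign-key value) become candidate pairs.
fk_pairs = candidate_pairs(build_blocks(people, lambda r: r["address_id"]))

print(len(all_pairs), "pairs without blocking")
print(len(fk_pairs), "pairs with foreign-key blocking:", sorted(fk_pairs))
```

In this toy case the candidate set shrinks to the pairs that share an address, so a reviewer would have fewer pairs to inspect. The sketch also shows the trade-off named in the abstract: records 1 and 3 carry the same name but fall into different blocks, so they are never compared.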

Original language: English
Title of host publication: Artificial Intelligence Research and Development
Pages: 141-149
Number of pages: 9
Publication status: Published - 2007
Externally published
Event: 10th International Conference of the Catalan Association for Artificial Intelligence, CCIA 2007 - Sant Julia de Loria, Andorra
Duration: 25 Oct 2007 to 26 Oct 2007

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 163
ISSN (print): 0922-6389

Conference

Conference: 10th International Conference of the Catalan Association for Artificial Intelligence, CCIA 2007
Country/Territory: Andorra
City: Sant Julia de Loria
Period: 25/10/07 to 26/10/07
