On the use of semantic blocking techniques for data cleansing and integration

Jordi Nin, Victor Muntés-Mulero, Norbert Martínez-Bazan, Josep L. Larriba-Pey

Research output: Chapter in book, Conference contribution, peer-reviewed

18 Citations (Scopus)

Abstract

Record Linkage (RL) is an important component of data cleansing and integration. For years, many efforts have focused on improving the performance of the RL process, either by reducing the number of record comparisons or by reducing the number of attribute comparisons. Both approaches reduce computational time but very often decrease the quality of the results. However, the real bottleneck of RL is the post-process, in which experts have to review the results and decide which pairs or groups of records are real links and which are false hits. In this paper, we show that exploiting the relationships (e.g., foreign keys) established across one or more data sources makes it possible to find a new sort of semantic blocking method that improves the number of hits and reduces the amount of review effort.
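The abstract does not spell out the blocking method itself, so the following is only a minimal sketch of the general idea of blocking on a relationship: records are compared only when they reference the same related record through a foreign key. The table, the field names (`id`, `name`, `address_id`) and both helper functions are hypothetical illustrations, not the authors' algorithm.

```python
from collections import defaultdict
from itertools import combinations


def block_by_foreign_key(records, fk_field):
    """Group record ids by the value of a foreign-key field (assumed name)."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec[fk_field]].append(rec["id"])
    return blocks


def candidate_pairs(blocks):
    """Collect the within-block id pairs; only these would be compared."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical toy data: customer records referencing an address table.
customers = [
    {"id": 1, "name": "Ann Smith", "address_id": 100},
    {"id": 2, "name": "A. Smith", "address_id": 100},
    {"id": 3, "name": "Bob Jones", "address_id": 200},
    {"id": 4, "name": "Robert Jones", "address_id": 200},
    {"id": 5, "name": "Ann Smith", "address_id": 300},
]

pairs = candidate_pairs(block_by_foreign_key(customers, "address_id"))
all_pairs = len(customers) * (len(customers) - 1) // 2
print(sorted(pairs), f"{len(pairs)} of {all_pairs} possible pairs")
```

In this toy case only two of the ten possible pairs reach the comparison step, which is also the set a reviewer would have to inspect. The pair of identical names at different addresses (ids 1 and 5) is never compared, which illustrates the trade-off between fewer comparisons and result quality that the abstract mentions.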

Original language: English
Title of host publication: 11th International Database Engineering and Applications Symposium Proceedings, IDEAS
Pages: 190-198
Number of pages: 9
Status: Published - 2007
Externally published
Event: 11th International Database Engineering and Applications Symposium - IDEAS'2007 - Banff, AB, Canada
Duration: 6 Sep 2007 to 8 Sep 2007

Publication series

Name: Proceedings of the International Database Engineering and Applications Symposium, IDEAS
ISSN (print): 1098-8068

Conference

Conference: 11th International Database Engineering and Applications Symposium - IDEAS'2007
Country/Territory: Canada
City: Banff, AB
Period: 6/09/07 to 8/09/07
