TY - JOUR
T1 - DEArt: Building and evaluating a dataset for object detection and pose classification for European art
AU - Reshetnikov, Artem
AU - Marinescu, Maria Cristina
AU - More Lopez, Joaquim
AU - Mendoza, Sergio
AU - Freire, Nuno
AU - Marrero, Monica
AU - Tsoupra, Eleftheria
AU - Isaac, Antoine
N1 - Publisher Copyright:
© 2025
PY - 2025/10/1
Y1 - 2025/10/1
N2 - Annotation of cultural heritage artefacts allows finding and exploring items relevant to user needs, supports functionality such as question answering or scene understanding, and in general facilitates society's exposure to our history and heritage. But most artefacts lack a description of their visual content due to the assumption that one sees the object; this often means that the annotation effort focuses on the historical and artistic context, information about the painter, or details about the execution and medium. Without a significant body of visual content annotations, machines cannot integrate all this data to allow further analysis, querying, and inference, and cultural institutions cannot offer advanced functionality to their users and visitors. Given how time-consuming manual annotation is, and to enable the development of new technology and applications for cultural heritage, we have provided through DEArt the most extensive art dataset for object detection and pose classification to date. The current paper extends this work in several ways: (1) we introduce an approach for generating refined object and relationship labels without the need for manual annotations, (2) we compare the performance of our models with the most relevant state of the art in both computer vision and cultural heritage, (3) we evaluate the annotations generated by our object detection model from a user viewpoint, for both correctness and relevance, and (4) we briefly discuss the fairness of our dataset.
AB - Annotation of cultural heritage artefacts allows finding and exploring items relevant to user needs, supports functionality such as question answering or scene understanding, and in general facilitates society's exposure to our history and heritage. But most artefacts lack a description of their visual content due to the assumption that one sees the object; this often means that the annotation effort focuses on the historical and artistic context, information about the painter, or details about the execution and medium. Without a significant body of visual content annotations, machines cannot integrate all this data to allow further analysis, querying, and inference, and cultural institutions cannot offer advanced functionality to their users and visitors. Given how time-consuming manual annotation is, and to enable the development of new technology and applications for cultural heritage, we have provided through DEArt the most extensive art dataset for object detection and pose classification to date. The current paper extends this work in several ways: (1) we introduce an approach for generating refined object and relationship labels without the need for manual annotations, (2) we compare the performance of our models with the most relevant state of the art in both computer vision and cultural heritage, (3) we evaluate the annotations generated by our object detection model from a user viewpoint, for both correctness and relevance, and (4) we briefly discuss the fairness of our dataset.
KW - Dataset/model fairness
KW - Deep learning architecture
KW - Iconographic art
KW - Object detection
KW - Pose classification
UR - https://www.scopus.com/pages/publications/105012984368
UR - http://hdl.handle.net/20.500.14342/5499
U2 - 10.1016/j.culher.2025.07.022
DO - 10.1016/j.culher.2025.07.022
M3 - Article
AN - SCOPUS:105012984368
SN - 1296-2074
VL - 75
SP - 258
EP - 266
JO - Journal of Cultural Heritage
JF - Journal of Cultural Heritage
ER -