TY - GEN
T1 - Experimenting with Affective Computing Models in Video Interviews with Spanish-Speaking Older Adults
AU - López Camuñas, Josep
AU - Bustos, Cristina
AU - Zhu, Yanjun
AU - Ros, Raquel
AU - Lapedriza, Agata
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Understanding emotional signals in older adults is crucial for designing virtual assistants that support their well-being. However, existing affective computing models often face significant limitations: (1) limited availability of datasets representing older adults, especially in non-English-speaking populations, and (2) poor generalization of models trained on younger or homogeneous demographics. To address these gaps, this study evaluates state-of-the-art affective computing models-including facial expression recognition, text sentiment analysis, and smile detection-using videos of older adults interacting with either a person or a virtual avatar. As part of this effort, we introduce a novel dataset featuring Spanish-speaking older adults engaged in human-to-human video interviews. Through three comprehensive analyses, we investigate (1) the alignment between human-annotated labels and automatic model outputs, (2) the relationships between model outputs across different modalities, and (3) individual variations in emotional signals. Using both the Wizard of Oz (WoZ) dataset and our newly collected dataset, we uncover limited agreement between human annotations and model predictions, weak consistency across modalities, and significant variability among individuals. These findings high-light the shortcomings of generalized emotion perception models and emphasize the need of incorporating personal variability and cultural nuances into future systems.
AB - Understanding emotional signals in older adults is crucial for designing virtual assistants that support their well-being. However, existing affective computing models often face significant limitations: (1) limited availability of datasets representing older adults, especially in non-English-speaking populations, and (2) poor generalization of models trained on younger or homogeneous demographics. To address these gaps, this study evaluates state-of-the-art affective computing models-including facial expression recognition, text sentiment analysis, and smile detection-using videos of older adults interacting with either a person or a virtual avatar. As part of this effort, we introduce a novel dataset featuring Spanish-speaking older adults engaged in human-to-human video interviews. Through three comprehensive analyses, we investigate (1) the alignment between human-annotated labels and automatic model outputs, (2) the relationships between model outputs across different modalities, and (3) individual variations in emotional signals. Using both the Wizard of Oz (WoZ) dataset and our newly collected dataset, we uncover limited agreement between human annotations and model predictions, weak consistency across modalities, and significant variability among individuals. These findings high-light the shortcomings of generalized emotion perception models and emphasize the need of incorporating personal variability and cultural nuances into future systems.
KW - affective computing
KW - emotion perception
KW - emotion recognition
KW - facial expression recognition
KW - multimodal analysis
KW - text sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=105005025164&partnerID=8YFLogxK
U2 - 10.1109/WACVW65960.2025.00006
DO - 10.1109/WACVW65960.2025.00006
M3 - Conference contribution
AN - SCOPUS:105005025164
T3 - Proceedings - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025
SP - 1
EP - 10
BT - Proceedings - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025
Y2 - 28 February 2025 through 4 March 2025
ER -