TY - GEN
T1 - Dataset and Evaluation of Automatic Speech Recognition for Multi-lingual Intent Recognition on Social Robots
AU - Andriella, Antonio
AU - Ros, Raquel
AU - Ellinson, Yoav
AU - Gannot, Sharon
AU - Lemaignan, Séverin
N1 - Publisher Copyright: © 2024 IEEE Computer Society. All rights reserved.
PY - 2024/3/11
Y1 - 2024/3/11
N2 - While Automatic Speech Recognition (ASR) systems excel in controlled environments, challenges arise in robot-specific setups due to unique microphone requirements and added noise sources. In this paper, we create a dataset of initiating conversations with brief exchanges in 5 European languages, and we systematically evaluate current state-of-the-art ASR systems (Vosk, OpenWhisper, Google Speech and NVidia Riva). Besides standard metrics, we also look at two critical downstream tasks for human-robot verbal interaction: intent recognition rate and entity extraction, using the open-source Rasa chatbot. Overall, we found that open-source solutions such as Vosk perform competitively with closed-source solutions while running on the edge, on a low compute budget (CPU only).
AB - While Automatic Speech Recognition (ASR) systems excel in controlled environments, challenges arise in robot-specific setups due to unique microphone requirements and added noise sources. In this paper, we create a dataset of initiating conversations with brief exchanges in 5 European languages, and we systematically evaluate current state-of-the-art ASR systems (Vosk, OpenWhisper, Google Speech and NVidia Riva). Besides standard metrics, we also look at two critical downstream tasks for human-robot verbal interaction: intent recognition rate and entity extraction, using the open-source Rasa chatbot. Overall, we found that open-source solutions such as Vosk perform competitively with closed-source solutions while running on the edge, on a low compute budget (CPU only).
KW - Assistive Robotics
KW - Audio Dataset
KW - Automatic Speech Recognition
KW - Human-Robot Interaction
UR - http://www.scopus.com/inward/record.url?scp=85188418358&partnerID=8YFLogxK
U2 - 10.1145/3610977.3637473
DO - 10.1145/3610977.3637473
M3 - Conference contribution
AN - SCOPUS:85188418358
T3 - ACM/IEEE International Conference on Human-Robot Interaction
SP - 865
EP - 869
BT - HRI 2024 - Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
PB - IEEE Computer Society
T2 - 19th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2024
Y2 - 11 March 2024 through 15 March 2024
ER -