Analysis of Representation and Generalization Capabilities of Pre-Trained Audio Models in Urban Environments

Research output: Book chapter, Conference contribution, Peer-reviewed

Abstract

Over the last decade, urban noise pollution has become a significant environmental concern. Audio detection algorithms can help mitigate it by classifying different noise sources and producing more informative noise maps. In this context, machine learning, particularly transfer learning, is an essential technology that enables accurate analysis of urban noise sources. However, the choice of the pre-trained model used to compute audio embeddings can significantly influence the performance of downstream classification tasks. This paper compares the embeddings of various pre-trained models across different data collection campaigns of the Sons al balcó project and quantifies the robustness of the resulting audio representations. To achieve this, we develop metrics and statistically test for distribution shifts in the learned latent features. To evaluate the quality of the embeddings, we perform both qualitative and quantitative analysis using dimensionality reduction methods, and we assess performance on downstream tasks using data from different collection campaigns. Results highlight major differences between general-purpose and specific models. Our findings suggest that the pre-trained model should be chosen carefully in audio event detection applications.
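The abstract does not say which metrics or statistical tests the authors use. As an illustration only, one common way to test for a distribution shift between embeddings from two collection campaigns is a permutation test on the energy distance. The sketch below assumes the embeddings are already available as NumPy arrays with one row per audio clip. The function names `energy_distance` and `permutation_shift_test` are hypothetical and are not taken from the paper.

```python
import numpy as np


def energy_distance(x, y):
    """Energy distance between two samples of row vectors (an assumed metric)."""

    def mean_pairwise(a, b):
        # Mean Euclidean distance over all pairs (one row from a, one from b).
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_pairwise(x, y) - mean_pairwise(x, x) - mean_pairwise(y, y)


def permutation_shift_test(x, y, n_permutations=200, seed=0):
    """Permutation test of the null hypothesis that x and y share a distribution.

    Returns the observed energy distance and a p-value estimated by
    randomly reassigning the pooled rows to the two groups.
    """
    rng = np.random.default_rng(seed)
    observed = energy_distance(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = energy_distance(pooled[perm[:n]], pooled[perm[n:]])
        if stat >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value would suggest that the two campaigns occupy different regions of the embedding space for the model under test. Comparing the observed distance across pre-trained models is one possible way to rank their robustness, though the paper may use different metrics.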

Original language: English
Title of host publication: 53rd International Congress and Exposition on Noise Control Engineering, Internoise 2024
Publisher: Societe Francaise d'Acoustique
Pages: 5979-5990
Number of pages: 12
ISBN (electronic): 9798331322151
Publication status: Published - 2024
Event: 53rd International Congress and Exposition on Noise Control Engineering, Internoise 2024 - Nantes, France
Duration: 25 Aug 2024 to 29 Aug 2024

Publication series

Name: 53rd International Congress and Exposition on Noise Control Engineering, Internoise 2024
Volume: 8

Conference

Conference: 53rd International Congress and Exposition on Noise Control Engineering, Internoise 2024
Country/Territory: France
City: Nantes
Period: 25/08/24 to 29/08/24

UN Sustainable Development Goals

This output contributes to the following Sustainable Development Goals.

  1. SDG 11 - Sustainable Cities and Communities
