TY - JOUR
T1 - Sound event detection by intermittency ratio criterium and source classification by deep learning techniques
AU - Vidaña-Vila, Ester
AU - Brambilla, Giovanni
AU - Alsina-Pagès, Rosa Ma
N1 - Publisher Copyright: © 2025 the author(s), published by De Gruyter.
PY - 2025/1/1
Y1 - 2025/1/1
AB - Urban environments are characterized by a complex interplay of sound sources, which significantly influence overall soundscape quality. This study presents a methodology that combines the intermittency ratio (IR) metric for acoustic event detection with deep learning (DL) techniques for classifying the sound sources associated with these events. The aim is to provide an automated tool for detecting and categorizing polyphonic acoustic events, thereby enhancing our ability to assess and manage environmental noise. Using a dataset collected in the city center of Barcelona, our results demonstrate that the IR metric detects events from diverse categories. Specifically, the IR captures the temporal variations in sound pressure level caused by significant noise events, which enables their detection but provides no information on the associated sound sources. The DL-based classification system, which uses a MobileNet convolutional neural network, addresses this limitation and shows promise in identifying foreground sound sources. Our findings highlight the potential of DL techniques to automate the classification of sound sources, providing valuable insights into the acoustic environment. Combining the two techniques represents a step forward in automating acoustic event detection and classification in urban soundscapes and provides information to support noise mitigation actions.
KW - convolutional neural network
KW - intermittency ratio
KW - sound event detection
KW - sound source identification
KW - urban noise
UR - http://www.scopus.com/inward/record.url?scp=105003430206&partnerID=8YFLogxK
U2 - 10.1515/noise-2024-0014
DO - 10.1515/noise-2024-0014
M3 - Article
AN - SCOPUS:105003430206
SN - 2084-879X
VL - 12
JO - Noise Mapping
JF - Noise Mapping
IS - 1
M1 - 20240014
ER -