TY - JOUR
T1 - Evaluation of Glottal Inverse Filtering Techniques on OPENGLOT Synthetic Male and Female Vowels
AU - Freixes, Marc
AU - Joglar-Ongay, Luis
AU - Socoró, Joan Claudi
AU - Alías-Pujol, Francesc
N1 - Funding Information:
This research was partially funded by the Agencia Estatal de Investigación (AEI) through the FEMVoQ project (PID2020-120441GB-I00/AEI/10.13039/501100011033). The authors also thank the Departament de Recerca i Universitats (Generalitat de Catalunya) for their support under Grant Ref. 2021 SGR 01396.
Publisher Copyright:
© 2023 by the authors.
PY - 2023/8
Y1 - 2023/8
N2 - Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.
AB - Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.
KW - glottal inverse filtering
KW - glottal source
KW - OPENGLOT
KW - performance evaluation
KW - phonation types
KW - speech analysis
UR - http://www.scopus.com/inward/record.url?scp=85167906206&partnerID=8YFLogxK
U2 - 10.3390/app13158775
DO - 10.3390/app13158775
M3 - Article
AN - SCOPUS:85167906206
SN - 2076-3417
VL - 13
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 15
M1 - 8775
ER -