Evaluation of Glottal Inverse Filtering Techniques on OPENGLOT Synthetic Male and Female Vowels †

Marc Freixes; Luis Joglar-Ongay; Joan Claudi Socoró; Francesc Alías-Pujol

doi:10.3390/app13158775

Research output: Article in indexed journal › Article › Peer-reviewed

Abstract

Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.

Original language	English
Article number	8775
Number of pages	22
Journal	Applied Sciences (Switzerland)
Volume	13
Issue number	15
DOIs	https://doi.org/10.3390/app13158775
Publication status	Published - Aug 2023

Access to document

10.3390/app13158775

Cite this

@article{2d9f53ebd8b94fc9842c5db7646e73e8,

title = "Evaluation of Glottal Inverse Filtering Techniques on OPENGLOT Synthetic Male and Female Vowels †",

abstract = "Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.",

keywords = "glottal inverse filtering, glottal source, OPENGLOT, performance evaluation, phonation types, speech analysis",

author = "Marc Freixes and Luis Joglar-Ongay and Socor{\'o}, {Joan Claudi} and Francesc Al{\'i}as-Pujol",

note = "Funding Information: This research was partially funded by the Agencia Estatal de Investigaci{\'o}n (AEI) through the FEMVoQ project (PID2020-120441GB-I00/AEI/10.13039/501100011033). The authors also thank the Departament de Recerca i Universitats (Generalitat de Catalunya) for their support under Grant Ref. 2021 SGR 01396. Publisher Copyright: {\textcopyright} 2023 by the authors.",

year = "2023",

month = aug,

doi = "10.3390/app13158775",

language = "English",

volume = "13",

journal = "Applied Sciences (Switzerland)",

issn = "2076-3417",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "15",

}

TY - JOUR

T1 - Evaluation of Glottal Inverse Filtering Techniques on OPENGLOT Synthetic Male and Female Vowels †

AU - Freixes, Marc

AU - Joglar-Ongay, Luis

AU - Socoró, Joan Claudi

AU - Alías-Pujol, Francesc

N1 - Funding Information: This research was partially funded by the Agencia Estatal de Investigación (AEI) through the FEMVoQ project (PID2020-120441GB-I00/AEI/10.13039/501100011033). The authors also thank the Departament de Recerca i Universitats (Generalitat de Catalunya) for their support under Grant Ref. 2021 SGR 01396. Publisher Copyright: © 2023 by the authors.

PY - 2023/8

Y1 - 2023/8

N2 - Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.

AB - Current articulatory-based three-dimensional source–filter models, which allow the production of vowels and diphthongs, still present very limited expressiveness. Glottal inverse filtering (GIF) techniques can become instrumental to identify specific characteristics of both the glottal source signal and the vocal tract transfer function to resemble expressive speech. Several GIF methods have been proposed in the literature; however, their comparison becomes difficult due to the lack of common and exhaustive experimental settings. In this work, first, a two-phase analysis methodology for the comparison of GIF techniques based on a reference dataset is introduced. Next, state-of-the-art GIF techniques based on iterative adaptive inverse filtering (IAIF) and quasi closed phase (QCP) approaches are thoroughly evaluated on OPENGLOT, an open database specifically designed to evaluate GIF, computing well-established GIF error measures after extending male vowels with their female counterparts. The results show that GIF methods obtain better results on male vowels. The QCP-based techniques significantly outperform IAIF-based methods for almost all error metrics and scenarios and are, at the same time, more stable across sex, phonation type, F0, and vowels. The IAIF variants improve the original technique for most error metrics on male vowels, while QCP with spectral tilt compensation achieves a lower spectral tilt error for male vowels than the original QCP.

KW - glottal inverse filtering

KW - glottal source

KW - OPENGLOT

KW - performance evaluation

KW - phonation types

KW - speech analysis

UR - http://www.scopus.com/inward/record.url?scp=85167906206&partnerID=8YFLogxK

U2 - 10.3390/app13158775

DO - 10.3390/app13158775

M3 - Article

AN - SCOPUS:85167906206

SN - 2076-3417

VL - 13

JO - Applied Sciences (Switzerland)

JF - Applied Sciences (Switzerland)

IS - 15

M1 - 8775

ER -
