Glottal source contribution to higher order modes in the finite element synthesis of vowels

Marc Freixes; Marc Arnela; Joan Claudi Socoró; Francesc Alías; Oriol Guasch

doi:10.3390/app9214535

Research output: Indexed journal article › Article › Peer-reviewed

8 Citations (Scopus)

Abstract

Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants-Fant) model enhanced with aspiration noise and controlled by the R_d glottal shape parameter. The vowels [A], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower R_d values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.
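
As a rough illustration of why higher order modes are usually associated with frequencies above about 5 kHz, the sketch below estimates the cut-on frequency of the first non-planar mode in an idealised rigid-walled duct of circular cross-section. It is not taken from the paper: the speed of sound (350 m/s), the example radii, and the function name `cut_on_frequency` are assumptions chosen for illustration, while 1.8412 and 3.8317 are the standard first zeros of the Bessel-function derivatives for circular ducts. Real vocal tract cross-sections are not circular, so the results are order-of-magnitude estimates only.

```python
import math

# First zeros of the Bessel-function derivatives for a rigid circular duct.
FIRST_ASYMMETRIC_ZERO = 1.8412    # first non-planar mode overall
FIRST_AXISYMMETRIC_ZERO = 3.8317  # first axisymmetric non-planar mode


def cut_on_frequency(radius_m, speed_of_sound=350.0,
                     bessel_zero=FIRST_ASYMMETRIC_ZERO):
    """Return the cut-on frequency in Hz of a duct mode.

    Assumes an idealised rigid-walled duct of circular cross-section;
    speed_of_sound (m/s) is an assumed value, not one from the paper.
    """
    if radius_m <= 0:
        raise ValueError("radius_m must be positive")
    return bessel_zero * speed_of_sound / (2.0 * math.pi * radius_m)


if __name__ == "__main__":
    # Hypothetical radii spanning narrow to wide vocal tract sections.
    for radius_cm in (1.0, 1.5, 2.0):
        f_hz = cut_on_frequency(radius_cm / 100.0)
        print(f"radius {radius_cm:.1f} cm: "
              f"first non-planar mode near {f_hz / 1000:.1f} kHz")
```

Under these assumptions, a radius of 2 cm gives a cut-on frequency slightly above 5 kHz, and narrower sections give higher values, which is consistent with the frequency range mentioned in the abstract.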

Original language	English
Article number	4535
Journal	Applied Sciences (Switzerland)
Volume	9
Issue number	21
DOIs	https://doi.org/10.3390/app9214535
Publication status	Published - 1 Nov 2019

Access to document

10.3390/app9214535

Other files and links

Link to publication in Scopus

Cite this

@article{95899ab4abe84261922bfc31b6755940,

title = "Glottal source contribution to higher order modes in the finite element synthesis of vowels",

abstract = "Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants-Fant) model enhanced with aspiration noise and controlled by the Rd glottal shape parameter. The vowels [A], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower Rd values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.",

keywords = "Finite element method, Glottal source, High frequency energy, Higher order modes, LF model, Numerical simulation, Voice production",

author = "Freixes, Marc and Arnela, Marc and Socor{\'o}, {Joan Claudi} and Al{\'i}as, Francesc and Guasch, Oriol",

note = "Funding Information: The authors are grateful to Saeed Dabbaghchian for the design of the vocal tract geometry simplifications and Lisa Kinnear for the English proofreading. This research was funded by the Agencia Estatal de Investigaci{\'o}n (AEI) and FEDER, EU, through project GENIOVOX TEC2016-81107-P. The fourth and fifth authors also acknowledge the support from the Obra Social {"}La Caixa{"} under respective grants ref. 2018-URL-IR2nQ-029 and 2018-URL-IR2nQ-031. Publisher Copyright: {\textcopyright} 2019 by the authors.",

year = "2019",

month = nov,

day = "1",

doi = "10.3390/app9214535",

language = "English",

volume = "9",

journal = "Applied Sciences (Switzerland)",

issn = "2076-3417",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "21",

}

TY - JOUR

T1 - Glottal source contribution to higher order modes in the finite element synthesis of vowels

AU - Freixes, Marc

AU - Arnela, Marc

AU - Socoró, Joan Claudi

AU - Alías, Francesc

AU - Guasch, Oriol

N1 - Funding Information: The authors are grateful to Saeed Dabbaghchian for the design of the vocal tract geometry simplifications and Lisa Kinnear for the English proofreading. This research was funded by the Agencia Estatal de Investigación (AEI) and FEDER, EU, through project GENIOVOX TEC2016-81107-P. The fourth and fifth authors also acknowledge the support from the Obra Social "La Caixa" under respective grants ref. 2018-URL-IR2nQ-029 and 2018-URL-IR2nQ-031. Publisher Copyright: © 2019 by the authors.

PY - 2019/11/1

Y1 - 2019/11/1

N2 - Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants-Fant) model enhanced with aspiration noise and controlled by the Rd glottal shape parameter. The vowels [A], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower Rd values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.

AB - Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants-Fant) model enhanced with aspiration noise and controlled by the Rd glottal shape parameter. The vowels [A], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower Rd values) and/or high fundamental frequency values, F0s. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.

KW - Finite element method

KW - Glottal source

KW - High frequency energy

KW - Higher order modes

KW - LF model

KW - Numerical simulation

KW - Voice production

UR - http://www.scopus.com/inward/record.url?scp=85075241330&partnerID=8YFLogxK

U2 - 10.3390/app9214535

DO - 10.3390/app9214535

M3 - Article

AN - SCOPUS:85075241330

SN - 2076-3417

VL - 9

JO - Applied Sciences (Switzerland)

JF - Applied Sciences (Switzerland)

IS - 21

M1 - 4535

ER -
