Finite Element Synthesis of Diphthongs Using Tuned Two-Dimensional Vocal Tracts

Research output: Indexed journal article (Article, peer-reviewed)

14 Citations (Scopus)


Three-dimensional (3-D) vocal tract acoustic modeling has the potential to generate high-quality, natural voice sounds, but at a large computational cost. Alternatively, 2-D models based on tuned vocal tracts have been shown to provide results similar to the 3-D ones with lower computational demands. However, they are currently limited to the synthesis of static vowel sounds. In this paper, the tuned 2-D approach is extended by considering moving vocal tracts to generate dynamic vowel sounds, such as diphthongs. Four tuning steps are followed to build a dynamic 2-D vocal tract model that can recover, to a large extent, the formant locations, bandwidths, and energies of a 3-D vocal tract with circular cross section, set in a spherical baffle representing the human head. Acoustic waves propagating through the time-evolving vocal tract and radiating to free field are simulated using the finite element method in the time domain. As examples, the diphthongs [ɑi] and [ɑu] have been generated using the tuning approach and compared, by means of objective and subjective evaluations, to those resulting from 3-D and conventional 2-D simulations.

Original language: English
Pages (from-to): 2013-2023
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Issue number: 10
Publication status: Published - Oct 2017


  • Diphthong synthesis
  • finite element method
  • speech synthesis
  • tuned two-dimensional vocal tracts
  • vocal tract acoustics

