Parametric model for vocal effort interpolation with Harmonics Plus Noise Models

Àngel Calzada Defez, Joan Claudi Socoró Carrié, Robert A.J. Clark

Research output: Contribution to conference > Contribution > Peer-reviewed

2 Citations (Scopus)

Abstract

It is known that voice quality plays an important role in expressive speech. In this paper, we present a methodology for modifying vocal effort level, which can be applied by text-to-speech (TTS) systems to provide the flexibility needed to improve the naturalness of synthesized speech. This extends previous work using low-order Linear Prediction Coefficients (LPC), where the flexibility was constrained by the number of vocal effort levels available in the corpora. The proposed methodology overcomes these limitations by replacing the low-order LPC with ninth-order polynomials, which allow vocal effort not only to be modified towards the available templates but also to be generated at intermediate levels between those available in the training data. This flexibility comes from combining Harmonics plus Noise Models with a parametric model of the spectral envelope. The perceptual tests conducted demonstrate the effectiveness of the proposed technique in performing vocal effort interpolations while maintaining the signal quality in the final synthesis. The proposed technique can be used in unit-selection TTS systems to reduce corpus size while increasing flexibility, and it could potentially be employed by HMM-based speech synthesis systems if appropriate acoustic features are used.
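To make the interpolation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the spectral envelope is described by log-amplitudes sampled at harmonic frequencies, that each vocal effort level is summarized by one ninth-order polynomial, and that intermediate levels are obtained by linearly blending the coefficient vectors. The function names, the 16 kHz sampling rate, the 120 Hz fundamental, and the synthetic spectral tilts are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

ORDER = 9      # ninth-order polynomial, as stated in the abstract
FS = 16000.0   # assumed sampling rate (hypothetical)


def _normalize(freqs_hz):
    # Map 0..Nyquist onto [-1, 1] so the least-squares fit stays well conditioned.
    return 2.0 * np.asarray(freqs_hz, dtype=float) / (FS / 2.0) - 1.0


def fit_envelope(freqs_hz, log_amps):
    """Fit a polynomial to log-amplitudes sampled at harmonic frequencies."""
    return P.polyfit(_normalize(freqs_hz), log_amps, ORDER)


def eval_envelope(coefs, freqs_hz):
    """Evaluate the polynomial envelope at the given frequencies."""
    return P.polyval(_normalize(freqs_hz), coefs)


def interpolate(coef_low, coef_high, alpha):
    """Blend two coefficient vectors; alpha=0 gives low effort, alpha=1 high."""
    return (1.0 - alpha) * coef_low + alpha * coef_high


# Synthetic example: harmonics of an assumed 120 Hz fundamental, with a
# steeper spectral tilt standing in for low effort and a flatter one for high.
freqs = 120.0 * np.arange(1, 67)
env_low = -6.0 * freqs / 1000.0    # made-up log-amplitudes, not measured data
env_high = -3.0 * freqs / 1000.0

coef_low = fit_envelope(freqs, env_low)
coef_high = fit_envelope(freqs, env_high)
coef_mid = interpolate(coef_low, coef_high, 0.5)
env_mid = eval_envelope(coef_mid, freqs)
```

Because a polynomial is linear in its coefficients, blending the coefficient vectors is equivalent to blending the two fitted envelopes point by point, which is one reason a parametric envelope lends itself to generating intermediate levels.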

Original language: English
Pages: 25-30
Number of pages: 6
Publication status: Published - 2013
Event: 8th ISCA Tutorial and Research Workshop on Speech Synthesis, SSW 2013 - Barcelona, Spain
Duration: 31 Aug 2013 to 2 Sep 2013

Conference

Conference: 8th ISCA Tutorial and Research Workshop on Speech Synthesis, SSW 2013
Country/Territory: Spain
City: Barcelona
Period: 31/08/13 to 2/09/13
