Linguistic and Mixed Excitation Improvements on a HMM-based speech synthesis for Castilian Spanish

Xavier Gonzalvo, Joan Claudi Socoró, Ignasi Iriondo, Carlos Monzo, Elisa Martínez

Research output: Conference contribution › Contribution › Peer-reviewed

13 Citations (Scopus)

Abstract

Hidden Markov Model based text-to-speech (HMM-TTS) synthesis is one of the techniques for generating speech from trained statistical models, in which the spectrum and prosody of basic speech units are modelled together. This paper presents the advances in our Spanish HMM-TTS system and reports a perceptual test comparing it with an extended PSOLA-based concatenative (E-PSOLA) system. The improvements concern the phonetic information and contextual factors specific to Castilian Spanish, and speech generation using a mixed excitation (ME) technique. The results show that the new HMM-TTS system is preferred over the previous system and achieves a better MOS than a real E-PSOLA system in terms of acceptability, intelligibility and stability.

Original language: English
Pages: 362-367
Number of pages: 6
Publication status: Published - 2007
Event: 6th ISCA Tutorial and Research Workshop on Speech Synthesis, SSW 2007 - Bonn, Germany
Duration: 22 Aug 2007 to 24 Aug 2007

Conference

Conference: 6th ISCA Tutorial and Research Workshop on Speech Synthesis, SSW 2007
Country/Territory: Germany
City: Bonn
Period: 22/08/07 to 24/08/07
