Mixing HMM-based spanish speech synthesis with a CBR for prosody estimation

Xavi Gonzalvo, Ignasi Iriondo, Joan Claudi Socoró, Francesc Alias, Carlos Monzo

Research output: Book chapter › Conference contribution › Peer-reviewed

Abstract

Hidden Markov Model-based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled together. This work describes a Spanish HMM-TTS system that uses an external machine learning technique to help improve expressiveness. System performance is analysed objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus, whose units were clustered using contextual factors based on the Spanish language. The results show that CBR-based F0 estimation can improve on the HMM-based baseline when synthesizing short non-declarative sentences, while duration accuracy is similar for the CBR and the HMM system.

Original language: English
Title of host publication: Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers
Publisher: Springer Verlag
Pages: 78-85
Number of pages: 8
ISBN (print): 3540773460, 9783540773467
DOI
Status: Published - 2007
Event: International Conference on Nonlinear Speech Processing, NOLISP 2007 - Paris, France
Duration: 22 May 2007 to 25 May 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4885 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: International Conference on Nonlinear Speech Processing, NOLISP 2007
Country/Territory: France
City: Paris
Period: 22/05/07 to 25/05/07
