Abstract
Hidden Markov Model based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled jointly. The aim of this work is to describe a Spanish HMM-TTS system that uses an external machine learning technique to help improve expressiveness. System performance is analysed objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus, whose units were clustered using contextual factors based on the Spanish language. The results show that the CBR-based F0 estimation is capable of improving the HMM-based baseline performance when synthesizing non-declarative short sentences, while duration accuracy is similar for the CBR and HMM systems.
Original language | English |
---|---|
Publication title | Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers |
Publisher | Springer Verlag |
Pages | 78-85 |
Number of pages | 8 |
ISBN (print) | 3540773460, 9783540773467 |
DOIs | |
Publication status | Published - 2007 |
Event | International Conference on Nonlinear Speech Processing, NOLISP 2007 - Paris, France Duration: 22 May 2007 → 25 May 2007 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 4885 LNAI |
ISSN (print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Conference
Conference | International Conference on Nonlinear Speech Processing, NOLISP 2007 |
---|---|
Country/Territory | France |
City | Paris |
Period | 22/05/07 → 25/05/07 |