Mixing HMM-based Spanish speech synthesis with a CBR for prosody estimation

Xavi Gonzalvo; Ignasi Iriondo; Joan Claudi Socoró; Francesc Alias; Carlos Monzo

doi:10.1007/978-3-540-77347-4_4

Research output: Chapter in book › Conference contribution › Peer-reviewed

Abstract

Hidden Markov Model-based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled together. The aim of this work is to describe a Spanish HMM-TTS system that uses an external machine learning technique to help improve expressiveness. System performance is analysed objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus, whose units were clustered using contextual factors based on the Spanish language. The results show that the CBR-based F0 estimation is capable of improving the HMM-based baseline performance when synthesizing non-declarative short sentences, while duration accuracy is similar for the CBR and the HMM system.

Original language	English
Title of host publication	Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers
Publisher	Springer Verlag
Pages	78-85
Number of pages	8
ISBN (print)	3540773460, 9783540773467
DOIs	https://doi.org/10.1007/978-3-540-77347-4_4
Publication status	Published - 2007
Event	International Conference on Nonlinear Speech Processing, NOLISP 2007 - Paris, France. Duration: 22 May 2007 → 25 May 2007

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	4885 LNAI
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Conference

Conference	International Conference on Nonlinear Speech Processing, NOLISP 2007
Country/Territory	France
City	Paris
Period	22/05/07 → 25/05/07

Access to document

10.1007/978-3-540-77347-4_4

How to cite

Gonzalvo, X., Iriondo, I., Socoró, J. C., Alias, F., & Monzo, C. (2007). Mixing HMM-based Spanish speech synthesis with a CBR for prosody estimation. In Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers (pp. 78-85). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4885 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-540-77347-4_4

@inproceedings{56c6b831bd3241179bcd4b62da7e7e47,
title = "Mixing HMM-based Spanish speech synthesis with a CBR for prosody estimation",
abstract = "Hidden Markov Model-based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled together. The aim of this work is to describe a Spanish HMM-TTS system that uses an external machine learning technique to help improve expressiveness. System performance is analysed objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus, whose units were clustered using contextual factors based on the Spanish language. The results show that the CBR-based F0 estimation is capable of improving the HMM-based baseline performance when synthesizing non-declarative short sentences, while duration accuracy is similar for the CBR and the HMM system.",
author = "Xavi Gonzalvo and Ignasi Iriondo and Socor{\'o}, {Joan Claudi} and Francesc Alias and Carlos Monzo",
year = "2007",
doi = "10.1007/978-3-540-77347-4_4",
language = "English",
isbn = "3540773460",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "78--85",
booktitle = "Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers",
address = "Germany",
note = "International Conference on Nonlinear Speech Processing, NOLISP 2007 ; Conference date: 22-05-2007 Through 25-05-2007",
}

TY - GEN
T1 - Mixing HMM-based Spanish speech synthesis with a CBR for prosody estimation
AU - Gonzalvo, Xavi
AU - Iriondo, Ignasi
AU - Socoró, Joan Claudi
AU - Alias, Francesc
AU - Monzo, Carlos
PY - 2007
Y1 - 2007
AB - Hidden Markov Model-based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum, pitch and durations of basic speech units are modelled together. The aim of this work is to describe a Spanish HMM-TTS system that uses an external machine learning technique to help improve expressiveness. System performance is analysed objectively and subjectively. The experiments were conducted on a reliably labelled speech corpus, whose units were clustered using contextual factors based on the Spanish language. The results show that the CBR-based F0 estimation is capable of improving the HMM-based baseline performance when synthesizing non-declarative short sentences, while duration accuracy is similar for the CBR and the HMM system.
UR - http://www.scopus.com/inward/record.url?scp=38549163738&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-77347-4_4
DO - 10.1007/978-3-540-77347-4_4
M3 - Conference contribution
AN - SCOPUS:38549163738
SN - 3540773460
SN - 9783540773467
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 78
EP - 85
BT - Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2007, Revised Selected Papers
PB - Springer Verlag
T2 - International Conference on Nonlinear Speech Processing, NOLISP 2007
Y2 - 22 May 2007 through 25 May 2007
ER -
