TY - GEN
T1 - High quality emotional HMM-based synthesis in Spanish
AU - Gonzalvo, Xavi
AU - Taylor, Paul
AU - Monzo, Carlos
AU - Iriondo, Ignasi
AU - Socoró, Joan Claudi
PY - 2010
Y1 - 2010
N2 - This paper describes high-quality Spanish HMM-based speech synthesis of emotional speaking styles. The quality of the HMM-based speech synthesis is enhanced by using the most recent features presented for the Blizzard system (i.e. STRAIGHT spectrum extraction and mixed excitation). Two techniques are evaluated. The first is a method that simultaneously models all emotions within a single acoustic model. The second is an adaptation technique that converts a neutral emotional style to a target emotion. We consider three kinds of emotional expression: neutral, happy and sad. A subjective evaluation shows the quality of the system and the intensity of the produced emotion, while an objective evaluation based on voice quality parameters assesses the effectiveness of the approaches.
AB - This paper describes high-quality Spanish HMM-based speech synthesis of emotional speaking styles. The quality of the HMM-based speech synthesis is enhanced by using the most recent features presented for the Blizzard system (i.e. STRAIGHT spectrum extraction and mixed excitation). Two techniques are evaluated. The first is a method that simultaneously models all emotions within a single acoustic model. The second is an adaptation technique that converts a neutral emotional style to a target emotion. We consider three kinds of emotional expression: neutral, happy and sad. A subjective evaluation shows the quality of the system and the intensity of the produced emotion, while an objective evaluation based on voice quality parameters assesses the effectiveness of the approaches.
KW - Adaptation
KW - Emotion
KW - HMM-based speech synthesis
UR - http://www.scopus.com/inward/record.url?scp=77951472390&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-11509-7_4
DO - 10.1007/978-3-642-11509-7_4
M3 - Conference contribution
AN - SCOPUS:77951472390
SN - 364211508X
SN - 9783642115080
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 26
EP - 34
BT - Advances in Nonlinear Speech Processing - International Conference on Nonlinear Speech Processing, NOLISP 2009, Revised Selected Papers
T2 - International Conference on Nonlinear Speech Processing, NOLISP 2009
Y2 - 25 June 2009 through 27 June 2009
ER -