TY - CONF
T1 - Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis
AU - Alías, Francesc
AU - Llorà, Xavier
N1 - Funding Information:
We sincerely thank Ignasi Iriondo for many useful discussions during the preparation of this manuscript. We would also like to thank the Generalitat de Catalunya (D.U.R.S.I.) for its support under grant number 2000FI-00679. This work was also supported by the Technology Research, Education and Commercialization Center (TRECC), a program of the University of Illinois at Urbana-Champaign, administered by the National Center for Supercomputing Applications (NCSA) and funded by the Office of Naval Research under grant N00014-01-1-0175. We would also like to thank the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, for its support under grant F49620-00-0163. Research funding for this work was also provided by the National Science Foundation under grant DMI-9908252.
PY - 2003
Y1 - 2003
AB - Unit selection text-to-speech (TTS) conversion is an ongoing research topic for the speech synthesis community. This paper focuses on tuning the weights involved in the target and concatenation cost metrics. We propose a method for automatically adjusting these weights simultaneously by means of diphone and triphone pairs. The method is based on techniques from the evolutionary computation community and takes advantage of their robustness in noisy domains. The experiments and their analyses demonstrate its good performance on this problem, thus overcoming some constraints assumed by previous work and leading to a new and interesting framework for further investigation.
UR - http://www.scopus.com/inward/record.url?scp=85009210634&partnerID=8YFLogxK
M3 - Contribution
AN - SCOPUS:85009210634
SP - 1333
EP - 1336
T2 - 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003
Y2 - 1 September 2003 through 4 September 2003
ER -