Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS

4 Citations (Scopus)


The quality of corpus-based text-to-speech systems depends on the accuracy of the unit selection process, which in turn relies on the values of the weights of the cost function. This paper focuses on defining a new framework for tuning these weights. We propose a technique that takes the subjective perception of speech into account in the selection process by means of Interactive Genetic Algorithms. Moreover, we introduce a CART-based method for unit clustering. Both techniques are applied to weight tuning based on diphone pairs. The experiments analyze the feasibility of each proposal separately.
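As an illustration of the first idea only, the following is a minimal sketch of how cost-function weights could be tuned from pairwise preference judgments with a genetic algorithm. It is not the implementation described in the paper: the weighted-sum form of the cost, the pairwise tournament, the averaging crossover, the Gaussian mutation, and all names (`weighted_cost`, `evolve_weights`, `simulated_listener`) are assumptions made for this example. In a real interactive setting the `prefer` callback would be a human listener comparing two syntheses; here it is replaced by a simulated listener with a hidden target weight vector.

```python
import random
from typing import Callable, List, Sequence

Weights = List[float]


def weighted_cost(subcosts: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of sub-costs (assumed form of the unit selection cost)."""
    return sum(w * c for w, c in zip(weights, subcosts))


def normalize(weights: Sequence[float]) -> Weights:
    """Scale weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def evolve_weights(
    prefer: Callable[[Weights, Weights], bool],
    n_weights: int,
    population_size: int = 8,  # assumed even, so individuals can be paired
    generations: int = 20,
    mutation_scale: float = 0.1,
    seed: int = 0,
) -> Weights:
    """Tune weights using only pairwise judgments: prefer(a, b) is True if a wins."""
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-6 for _ in range(n_weights)])
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: pair individuals at random and keep the preferred one.
        rng.shuffle(population)
        winners = [
            a if prefer(a, b) else b
            for a, b in zip(population[0::2], population[1::2])
        ]
        # Reproduction: average two winners, then perturb with Gaussian noise.
        children = []
        while len(winners) + len(children) < population_size:
            p, q = rng.choice(winners), rng.choice(winners)
            child = [
                max((x + y) / 2 + rng.gauss(0.0, mutation_scale), 1e-6)
                for x, y in zip(p, q)
            ]
            children.append(normalize(child))
        population = winners + children
    # Final pass: keep whichever candidate the listener prefers overall.
    best = population[0]
    for candidate in population[1:]:
        if prefer(candidate, best):
            best = candidate
    return best


def simulated_listener(target: Sequence[float]) -> Callable[[Weights, Weights], bool]:
    """Stand-in for a human judge: prefers weights closer to a hidden target."""
    def distance(w: Sequence[float]) -> float:
        return sum((x - t) ** 2 for x, t in zip(w, target))

    def prefer(a: Weights, b: Weights) -> bool:
        return distance(a) <= distance(b)

    return prefer


if __name__ == "__main__":
    tuned = evolve_weights(simulated_listener([0.6, 0.3, 0.1]), n_weights=3)
    print("tuned weights:", [round(w, 3) for w in tuned])
```

Because the fitness signal is a comparison rather than a numeric score, the loop never needs an objective quality measure, which is the property that makes a human listener usable as the evaluator. The CART-based clustering mentioned in the abstract is not covered by this sketch.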

Original language: English
Number of pages: 4
Publication status: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 4 Oct 2004 to 8 Oct 2004


Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island


