Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS

Research output: Contribution to conference › Contribution › Peer-reviewed

4 Citations (Scopus)

Abstract

The quality of corpus-based text-to-speech systems depends on the accuracy of the unit selection process, which relies on the values of the weights of the cost function. This paper focuses on defining a new framework for tuning these weights. We propose a technique for taking the subjective perception of speech into account in the selection process by means of Interactive Genetic Algorithms. Moreover, we introduce a CART-based method for unit clustering. Both techniques are applied to weight tuning based on diphone pairs. The experiments conducted analyze the feasibility of each proposal separately.

Original language: English
Pages: 1221-1224
Number of pages: 4
Publication status: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 4 Oct 2004 to 8 Oct 2004

Conference

Conference: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 4/10/04 to 8/10/04
