TY - CONF
T1 - Prosodic analysis of storytelling discourse modes and narrative situations oriented to Text-to-Speech synthesis
AU - Montaño, Raúl
AU - Alías, Francesc
AU - Ferrer, Josep
N1 - Funding Information:
The first author of this paper would like to acknowledge the support of the Catalan Government (SUR/ECO) for the predoctoral FI grant No. 2013FI_N 00790. We also thank Àngel Calzada and Dr. Joan Claudi Socoró for their support in the HNM synthesis implementation.
Publisher Copyright:
© SSW 2013. All rights reserved.
PY - 2013
Y1 - 2013
AB - The generation of synthetic speech with a certain degree of expressiveness has been successful for some particular applications or speaking styles (e.g. emotions). In this context, there is a particular speaking style with subtle speech nuances that may be of great interest for delivering expressive speech: the storytelling style. The purpose of this paper is to define a first step towards developing a storytelling Text-to-Speech (TTS) synthesis system by modelling the specific prosodic patterns (pitch, intensity and tempo) of this speaking style. We base our analysis of a tale in Spanish on the discourse modes present in storytelling: narrative, descriptive and dialogue. Moreover, we introduce narrative situations (neutral narrative, post-character, suspense and affective situations) within the narrative mode, which are analysed at the sentence level. After grouping the sentences into modes and narrative situations, we analyse their corresponding prosodic patterns both objectively (via statistical tests) and subjectively (via a perceptual test using resynthesized sentences). The results show that the statistically validated prosodic rules perform as well as (or even better than) the original prosody in most sentences.
KW - Harmonic plus Noise Model
KW - Storytelling
KW - TTS
KW - narrative situations
KW - prosodic analysis
UR - http://www.scopus.com/inward/record.url?scp=85133203199&partnerID=8YFLogxK
M3 - Contribution
AN - SCOPUS:85133203199
SP - 171
EP - 176
T2 - 8th ISCA Tutorial and Research Workshop on Speech Synthesis, SSW 2013
Y2 - 31 August 2013 through 2 September 2013
ER -