Gesture synthesis adapted to speech emphasis

Adso Fernández-Baena, Raúl Montaño, Marc Antonijoan, Arturo Roversi, David Miralles, Francesc Alías

Research output: Indexed journal article › Article › peer-review

21 Citations (Scopus)

Abstract

Avatars communicate through speech and gestures to appear realistic and to enhance interaction with humans. In this context, several works have analyzed the relationship between speech and gestures, while others have focused on their synthesis, following different approaches. In this work, we address both goals by linking speech to gestures in terms of time and intensity, then use this knowledge to drive a gesture synthesizer from a manually annotated speech signal. To that end, we define strength indicators for speech and motion. After validating them through perceptual tests, we obtain an intensity rule from their correlation. Moreover, we derive a synchrony rule to determine temporal correspondences between speech and gestures. These analyses have been conducted on aggressive and neutral performances to cover a broad range of emphatic levels, whose speech signal and motion have been manually annotated. Next, the intensity and synchrony rules are used to drive a gesture synthesizer called gesture motion graph (GMG). These rules are validated by users through perceptual tests on GMG output animations. Results show that animations using both the intensity and synchrony rules perform better than those using only the synchrony rule (which in turn enhances realism with respect to random animation). Finally, we conclude that the extracted rules allow GMG to properly synthesize gestures adapted to speech emphasis from annotated speech.
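The pipeline the abstract describes — matching a speech strength indicator to a motion strength indicator (intensity rule) and aligning the gesture stroke with the speech emphasis (synchrony rule) — can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the normalized strength indicators, the nearest-strength selection, and the fixed stroke-lead offset are placeholders, not the paper's actual definitions or parameter values.

```python
# Hypothetical sketch: selecting and scheduling a gesture clip for an
# emphasized speech segment using an intensity rule and a synchrony rule.
# Indicator scales, selection criterion, and the stroke lead are assumed.

from dataclasses import dataclass


@dataclass
class GestureClip:
    name: str
    motion_strength: float  # assumed normalized motion strength indicator, 0..1
    stroke_time: float      # seconds from clip start to the gesture stroke


@dataclass
class EmphasisEvent:
    time: float             # time of the speech emphasis (seconds)
    speech_strength: float  # assumed normalized speech strength indicator, 0..1


def select_gesture(event, clips, stroke_lead=0.1):
    """Intensity rule (sketch): pick the clip whose motion strength is
    closest to the speech strength. Synchrony rule (sketch): schedule the
    clip so its stroke slightly precedes the emphasis; the 0.1 s lead is
    an illustrative placeholder, not a value from the paper."""
    best = min(clips, key=lambda c: abs(c.motion_strength - event.speech_strength))
    start_time = event.time - best.stroke_time - stroke_lead
    return best, start_time


clips = [
    GestureClip("subtle_beat", 0.2, 0.30),
    GestureClip("firm_beat", 0.6, 0.40),
    GestureClip("strong_beat", 0.9, 0.50),
]
gesture, start = select_gesture(EmphasisEvent(time=2.0, speech_strength=0.85), clips)
print(gesture.name, round(start, 2))  # → strong_beat 1.4
```

A strong speech emphasis thus selects the strongest candidate gesture, and the clip is started early enough that its stroke lands just before the emphasized moment.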

Original language: English
Pages (from-to): 331-350
Number of pages: 20
Journal: Speech Communication
Volume: 57
DOIs
Publication status: Published - 2014

Keywords

  • Body language
  • Character animation
  • Human computer interaction
  • Motion capture
  • Motion graphs
  • Speech analysis
  • Speech emphasis
