Emphatic visual speech synthesis

  • Javier Melenchón*
  • Elisa Martínez
  • Fernando De La Torre
  • José A. Montero

*Corresponding author for this work

Research output: Indexed journal article › Article › peer-review

13 Citations (Scopus)

Abstract

The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving similar perceptual scores as real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship.

Original language: English
Pages (from-to): 459-468
Number of pages: 10
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 17
Issue number: 3
Publication status: Published - Mar 2009

Keywords

  • Audiovisual speech synthesis
  • Emphatic visual-speech
  • Talking head
