Abstract
The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving similar perceptual scores as real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship.
| Original language | English |
|---|---|
| Pages (from-to) | 459-468 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Audio, Speech and Language Processing |
| Volume | 17 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Mar 2009 |
Keywords
- Audiovisual speech synthesis
- Emphatic visual-speech
- Talking head
Fingerprint
Dive into the research topics of 'Emphatic visual speech synthesis'. Together they form a unique fingerprint.