Realistic facial animation system for interactive services

Kang Liu; Joern Ostermann

doi:10.21437/Interspeech.2008-594

Details

Original language	English
Pages (from-to)	2330-2333
Number of pages	4
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status	Published - 2008
Event	INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia (22 Sept 2008 → 26 Sept 2008)

Abstract

This paper presents the optimization of the parameters of a talking head for web-based applications such as Newsreader and E-commerce, in which a realistic talking head initiates a conversation with users. Our talking head system consists of two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, composed of a personalized 3D mask and a large database of mouth images with their related information. The synthesis part generates facial animation by concatenating appropriate mouth images from the database. A critical issue in synthesis is unit selection, which selects the appropriate mouth images from the database so that they match the spoken words of the talking head. To achieve realistic facial animation, the unit selection has to be optimized. Objective criteria are proposed in this paper, and Pareto optimization is used to train the unit selection. Subjective tests are carried out in our web-based evaluation system. Experimental results show that most people cannot distinguish our facial animations from real videos.
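
The abstract names two techniques without giving details: unit selection over a database of mouth images, and Pareto optimization of its parameters. The following is a minimal Python sketch of both ideas, not the authors' implementation. It assumes that unit selection is done with a Viterbi-style search over per-frame candidates and that candidate parameter settings are scored on several objective criteria to be minimised. The function names, the `target_cost` and `concat_cost` callables, and the weights `w_target` and `w_concat` are all hypothetical.

```python
from typing import Callable, List, Sequence


def select_units(
    candidates: Sequence[Sequence[int]],
    target_cost: Callable[[int, int], float],
    concat_cost: Callable[[int, int], float],
    w_target: float = 1.0,
    w_concat: float = 1.0,
) -> List[int]:
    """Pick one unit id per frame, minimising the weighted total cost.

    candidates[t] lists the (hypothetical) mouth-image ids allowed at frame t.
    target_cost(t, u) scores how well unit u fits frame t;
    concat_cost(p, u) scores how smoothly unit u follows unit p.
    """
    # best[t][k]: cheapest cost of any path that ends in candidates[t][k]
    best = [[w_target * target_cost(0, u) for u in candidates[0]]]
    back = [[-1] * len(candidates[0])]
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for u in candidates[t]:
            costs = [
                best[t - 1][j] + w_concat * concat_cost(p, u)
                for j, p in enumerate(candidates[t - 1])
            ]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[j_min] + w_target * target_cost(t, u))
            ptr.append(j_min)
        best.append(row)
        back.append(ptr)
    # Trace the cheapest path back from the last frame to the first.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][k])
        k = back[t][k]
    return path[::-1]


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of points not dominated by any other point.

    Every objective is assumed to be minimised.
    """
    def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
        no_worse = all(x <= y for x, y in zip(a, b))
        better = any(x < y for x, y in zip(a, b))
        return no_worse and better

    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]
```

In this sketch, one would evaluate several settings of the weights with `select_units`, score each resulting animation on the objective criteria, and keep only the settings returned by `pareto_front`. Choosing a single setting from that front is left open here, as the abstract does not say how it is done.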

Keywords

Pareto optimization, Talking head, TTS (Text-to-Speech), Unit selection

ASJC Scopus subject areas

Computer Science(all)
  Human-Computer Interaction
  Signal Processing
  Software
Neuroscience(all)
  Sensory Systems

Cite this

Realistic facial animation system for interactive services. / Liu, Kang; Ostermann, Joern.
In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2008, p. 2330-2333.

Research output: Contribution to journal › Conference article › Research › peer review

Liu, K & Ostermann, J 2008, 'Realistic facial animation system for interactive services', Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp. 2330-2333. https://doi.org/10.21437/Interspeech.2008-594

Liu, K., & Ostermann, J. (2008). Realistic facial animation system for interactive services. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2330-2333. https://doi.org/10.21437/Interspeech.2008-594

Liu K, Ostermann J. Realistic facial animation system for interactive services. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2008;2330-2333. doi: 10.21437/Interspeech.2008-594

Liu, Kang ; Ostermann, Joern. / Realistic facial animation system for interactive services. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2008 ; pp. 2330-2333.

@article{7ffadf38f60f44d8a2e6ad86da9214e1,
  title = "Realistic facial animation system for interactive services",
  abstract = "This paper presents the optimization of parameters of talking head for web-based applications with a talking head, such as Newsreader and E-commerce, in which the realistic talking head initiates a conversation with users. Our talking head system includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a personalized 3D mask as well as a large database of mouth images and their related information. The synthesis part generates facial animation by concatenating appropriate mouth images from the database. A critical issue of the synthesis is the unit selection which selects these appropriate mouth images from the database such that they match the spoken words of the talking head. In order to achieve a realistic facial animation, the unit selection has to be optimized. Objective criteria are proposed in this paper and the Pareto optimization is used to train the unit selection. Subjective tests are carried out in our web-based evaluation system. Experimental results show that most people cannot distinguish our facial animations from real videos.",
  keywords = "Pareto optimization, Talking head, TTS (Text-to-Speech), Unit selection",
  author = "Kang Liu and Joern Ostermann",
  year = "2008",
  doi = "10.21437/Interspeech.2008-594",
  language = "English",
  pages = "2330--2333",
  note = "INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association ; Conference date: 22-09-2008 Through 26-09-2008",
}

TY - JOUR
T1 - Realistic facial animation system for interactive services
AU - Liu, Kang
AU - Ostermann, Joern
PY - 2008
Y1 - 2008
N2 - This paper presents the optimization of parameters of talking head for web-based applications with a talking head, such as Newsreader and E-commerce, in which the realistic talking head initiates a conversation with users. Our talking head system includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a personalized 3D mask as well as a large database of mouth images and their related information. The synthesis part generates facial animation by concatenating appropriate mouth images from the database. A critical issue of the synthesis is the unit selection which selects these appropriate mouth images from the database such that they match the spoken words of the talking head. In order to achieve a realistic facial animation, the unit selection has to be optimized. Objective criteria are proposed in this paper and the Pareto optimization is used to train the unit selection. Subjective tests are carried out in our web-based evaluation system. Experimental results show that most people cannot distinguish our facial animations from real videos.
AB - This paper presents the optimization of parameters of talking head for web-based applications with a talking head, such as Newsreader and E-commerce, in which the realistic talking head initiates a conversation with users. Our talking head system includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a personalized 3D mask as well as a large database of mouth images and their related information. The synthesis part generates facial animation by concatenating appropriate mouth images from the database. A critical issue of the synthesis is the unit selection which selects these appropriate mouth images from the database such that they match the spoken words of the talking head. In order to achieve a realistic facial animation, the unit selection has to be optimized. Objective criteria are proposed in this paper and the Pareto optimization is used to train the unit selection. Subjective tests are carried out in our web-based evaluation system. Experimental results show that most people cannot distinguish our facial animations from real videos.
KW - Pareto optimization
KW - Talking head
KW - TTS (Text-to-Speech)
KW - Unit selection
UR - http://www.scopus.com/inward/record.url?scp=84867227937&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2008-594
DO - 10.21437/Interspeech.2008-594
M3 - Conference article
AN - SCOPUS:84867227937
SP - 2330
EP - 2333
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SN - 2308-457X
T2 - INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association
Y2 - 22 September 2008 through 26 September 2008
ER -

