A Talking Head System for Korean Text

Sang-Wan Kim; Hoon Lee; Kyung-Ho Choi; Soon-Young Park

Abstract:

A talking head system (THS) is presented that animates the face of a speaking 3D avatar so that it realistically pronounces a given Korean text. The proposed system consists of a SAPI-compliant text-to-speech (TTS) engine and an MPEG-4-compliant face animation generator. The input to the THS is Unicode text that is to be spoken with synchronized lip shapes. The TTS engine generates the audio data and a phoneme sequence with the duration of each phoneme. It then applies coarticulation rules to the phoneme sequence and sends the resulting mouth animation sequence to the face modeler. By using the face animation generator, the proposed THS produces more natural lip sync and facial expressions than systems that rely on conventional visemes only. The experimental results indicate that the system is a promising basis for implementing a talking head for Korean text.
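The step from a timed phoneme sequence to a mouth animation sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the phoneme symbols, the `PHONEME_TO_VISEME` table and its numeric IDs, and the helpers `build_keyframes` and `mouth_at` are all hypothetical, and simple linear blending between phoneme midpoints stands in for the paper's coarticulation rules, which the abstract does not specify.

```python
from dataclasses import dataclass

# Hypothetical phoneme-to-viseme table. Symbols and IDs are placeholders
# chosen for illustration; they are not taken from the paper.
PHONEME_TO_VISEME = {"sil": 0, "m": 1, "n": 8, "a": 10, "i": 12}


@dataclass
class Keyframe:
    time_ms: float  # time at which the mouth fully shows this viseme
    viseme: int     # viseme ID from the table above


def build_keyframes(phonemes):
    """Turn (symbol, duration_ms) pairs into one keyframe per phoneme.

    Each keyframe sits at the temporal midpoint of its phoneme, so the
    mouth is already moving toward the next shape before the phoneme ends.
    Unknown symbols fall back to viseme 0.
    """
    frames = []
    start = 0.0
    for symbol, duration_ms in phonemes:
        viseme = PHONEME_TO_VISEME.get(symbol, 0)
        frames.append(Keyframe(start + duration_ms / 2, viseme))
        start += duration_ms
    return frames


def mouth_at(frames, t_ms):
    """Return (viseme_a, viseme_b, weight_b) for time t_ms.

    weight_b is the linear blend toward viseme_b between two keyframes.
    Before the first or after the last keyframe the shape is held.
    """
    if t_ms <= frames[0].time_ms:
        return (frames[0].viseme, frames[0].viseme, 0.0)
    for a, b in zip(frames, frames[1:]):
        if a.time_ms <= t_ms <= b.time_ms:
            span = b.time_ms - a.time_ms
            weight = 1.0 if span == 0 else (t_ms - a.time_ms) / span
            return (a.viseme, b.viseme, weight)
    last = frames[-1]
    return (last.viseme, last.viseme, 0.0)
```

For example, `build_keyframes([("m", 100), ("a", 200), ("n", 100)])` places keyframes at 50 ms, 200 ms, and 350 ms, and `mouth_at` can then be sampled at the renderer's frame rate to drive the blend between neighboring mouth shapes.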

Keywords: Talking head, Lip sync, TTS, MPEG-4.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1332216
