Talking head
Keywords
talking head, face animation, face detection, speech synthesis, visemes
Abstract
The talking head project covers three research areas: speech synthesis; detection of the face and facial features from a photograph, together with adaptation of an existing 3D model of a human head to the detected face; and, finally, simulation of human speech by the 3D model. The speech synthesis is based on the S2 corpus-based synthesizer, which selects and re-sequences speech units from a pre-recorded speech database. The model adaptation is performed by the FaceSimulator application. It uses two photographs, a frontal and a side view of the face, to detect the facial features. The detection is based on human skin chromaticity and on the morphological characteristics of the human head. The application can detect the nose, eyes, mouth, chin, forehead, and brows, as well as the color of the eyes. It also computes the skin texture for the model. The foundations of the model of the human face, and of the visemes that we use, were taken from the FaceGen project. A viseme, as we use the term, is a deformed model of the face; not an arbitrary deformation, but the deformation the face assumes when pronouncing a given phoneme. The animation of speech is realized by interpolating between viseme models; the interpolation yields the positions of the model's vertices at any given time.
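As an illustration of the corpus-based approach, the following sketch shows how a unit-selection synthesizer can pick and re-sequence units from a database by dynamic programming, minimizing a target cost plus a concatenation (join) cost. The cost functions and database layout here are hypothetical; the actual internals of the S2 synthesizer are not described in this text.

```python
# Minimal sketch of corpus-based unit selection (the general technique;
# not the actual S2 implementation). `database` maps each target phoneme
# to its candidate units; the cost functions are supplied by the caller.

def select_units(targets, database, join_cost, target_cost):
    """Pick one candidate unit per target by dynamic programming,
    minimising target cost plus concatenation (join) cost."""
    # prev[c] = (best total cost ending in candidate c, backpointer)
    prev = {c: (target_cost(targets[0], c), None) for c in database[targets[0]]}
    history = [prev]
    for t in targets[1:]:
        cur = {}
        for c in database[t]:
            # Cheapest way to reach candidate c from any previous unit.
            cost, back = min(
                (pc + join_cost(p, c), p) for p, (pc, _) in prev.items()
            )
            cur[c] = (cost + target_cost(t, c), back)
        history.append(cur)
        prev = cur
    # Trace the cheapest path back through the backpointers.
    unit, (_, back) = min(prev.items(), key=lambda kv: kv[1][0])
    path = [unit]
    for layer in reversed(history[:-1]):
        path.append(back)
        _, back = layer[back]
    return list(reversed(path))
```

A usage sketch: with a two-phoneme target and a toy database, the function returns the cheapest unit sequence under the given costs.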
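The skin-chromaticity idea behind the feature detection can be sketched as a per-pixel classifier: a pixel is labelled skin if its chromaticity falls in a typical skin region, independently of luminance. The YCbCr conversion below is the standard ITU-R BT.601 one, but the Cb/Cr ranges are commonly cited illustrative values, not the thresholds actually tuned in FaceSimulator.

```python
# Illustrative skin-pixel classifier based on chromaticity. The Cb/Cr
# skin ranges are typical values from the literature, used here only as
# an assumption; FaceSimulator's real thresholds are not given in the text.

def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel falls in a typical skin
    region of the CbCr chromaticity plane (luminance is ignored)."""
    # RGB -> CbCr (ITU-R BT.601, 8-bit, offset 128)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return 77 <= cb <= 127 and 133 <= cr <= 173
```

Running such a classifier over the photograph yields a binary skin mask; the morphological characteristics of the head (relative positions of eyes, nose, mouth) then localize the individual features within it.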
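The viseme interpolation described above can be sketched as follows: assuming each viseme stores the same ordered set of vertices, a linear blend of two visemes gives the vertex positions at any time between them (the real system may use a smoother interpolation curve; the data layout here is an assumption).

```python
# Minimal sketch of viseme interpolation: each viseme is a full set of
# deformed vertex positions, and the animated face at parameter t is a
# blend of the two visemes it lies between.

def interpolate_visemes(viseme_a, viseme_b, t):
    """Linearly blend two visemes (equal-length lists of (x, y, z)
    vertices). t = 0 gives viseme_a, t = 1 gives viseme_b."""
    return [
        tuple(a + t * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(viseme_a, viseme_b)
    ]
```

Playing back speech then amounts to stepping t from 0 to 1 over each phoneme's duration and rendering the blended model at every frame.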
You can download a sample of the talking head here.
You can download a demo of FaceSimulator here.