IMUROSA - Integration of Multimedia Signal Processing Methods into Multimodal Interface and Network Applications

(Integrácia metód spracovania MUltimediálnych signálov do multimodálneho ROzhrania a Sieťových Aplikácií)

VEGA 1/0708/13 (2013-2015)

Anotácia

Spracovanie signálov v rôznych odvetviach ľudskej činnosti ponúka novú dimenziu vnímania sveta. Oblasť záujmu vedecko-výskumného kolektívu predkladaného projektu je využitie algoritmov číslicového spracovania signálov. V rámci tejto definície sa vedecký výskum bude orientovať do nasledujúcich oblastí: vývoj algoritmov detekcie a rozpoznania reči, identifikácia hovoriaceho, syntéza reči a jej využitie pre kompresiu audio signálu, modifikácia prozódie syntetizovanej reči, algoritmy modelovania hlavy a animácie 3D modelu hlavy so zameraním na prenos multimediálnej informácie hovoru cez telekomunikačný kanál. Teoretické výsledky vo všetkých spomínaných oblastiach bude riešiteľský kolektív ďalej využívať v pripravovaných ako aj rozpracovaných aplikáciach, napr. v návrhu multimediálneho rozhrania človek-stroj. Výsledky budú využité na riešenie prijatého projektu 7. rámcového programu FP7-ICT-2011-7 HBB-Next (2011-2014, Gregor Rozinaj, koordinátor za FEI STU) v návrhu multimodálneho rozhrania pre systémy HBB.

Annotation

Signal processing in various areas of human activity offers a new dimension in the perception of the world. The research team's field of interest is the application of digital signal processing algorithms. Within this scope, the research focuses on the following areas: development of algorithms for speech detection and recognition, speaker identification, speech synthesis and its use for audio signal compression, prosody modification of synthesized speech, and algorithms for head modeling and animation of a 3D head model, with a focus on transmitting the multimedia information of a call over a telecommunication channel. The research team will further use the theoretical results from all of these areas in applications that are in preparation or already under development, e.g. in the design of a multimedia human-machine interface. The results will be used in the accepted 7th Framework Programme project FP7-ICT-2011-7 HBB-Next (2011-2014, Gregor Rozinaj, coordinator for FEI STU) in the design of a multimodal interface for HBB systems.

Keywords

multimedia, speech and image processing, multimodal interface, multimedia services

Scientific goals of the project

·         Design and optimization of speech features for speech recognition

·         Modification and adjustment of training schemes and learning rules for HMM models

·         Realization of a speech decoder and its optimization for specific tasks in Slovak

·         Design and evaluation of speech features and classification methods suitable for speaker identification, and realization of an identification system

·         Design and optimization of a compression method for corpus-based Slovak speech synthesis

·         Selection of compression algorithms and their optimization for audio and speech

·         Analysis of Slovak speech prosody and implementation of methods for its modification

·         Analysis and design of methods for optimal corpus creation with prosody and prediction of prosody based on text analysis

·         Selection and implementation of methods for singing voice synthesis and utilization of these methods for intonation modeling in speech synthesis

·         Analysis and design of methods for Slovak speech synthesis with basic emotions, and integration of the existing high-level synthesis with the low-level synthesis

·         Analysis and selection of algorithms for speaker voice conversion

·         Optimization of the weights of target cost parameters in unit selection synthesis

·         Integration of unit selection synthesis with diphone synthesis

·         Design and implementation of speech synthesis on a mobile phone

·         Development of algorithms and software tools for automatic processing and analysis of speech corpora

·         Design and verification of methods for detecting the structure of a video sequence

·         Design and optimization of a texture-based method for localizing facial features using Gabor filters

·         Design, implementation and evaluation of an HCI using web technologies

 

Research team

Rozinaj Gregor, doc. Ing., PhD., principal coordinator

Kačur Juraj, Ing., PhD., vice-coordinator

Podhradský Pavol, prof. Ing., PhD.

Turi Nagy Martin, Ing., PhD.

Vargic Radoslav, Ing., PhD.

Mikoczy Eugen, Ing., PhD.

Vojtko Juraj, Ing., PhD.

Schumann Sebastian, Ing.

Vrabec Ján, Ing.

Kőrősi Ján, Ing.

Kondelová Anna, Ing.

Tóth Ján, Ing.

Vasek Matúš, Ing.

Vančo Marek, Ing.

Varga Mário, Ing.

Minárik Ivan, Ing.

Londák Juraj, Ing.

Drozd Ivan, Ing.

Blichár Juraj, Ing.  

Sýkora Tomáš, Ing.