Published in: Springer Verlag, Lecture Notes in Computer Science, pp. 591–605

DOI: 10.1007/978-3-540-39451-8_43

Multimodal User State Recognition in a Modern Dialogue System

This paper is available in a repository.


Abstract

A new direction in improving automatic dialogue systems is to make a human-machine dialogue more similar to a human-human dialogue. A modern system should be able to recognize not only the semantic content of spoken utterances but also to interpret paralinguistic and non-verbal information, as indicators of the internal user state, in order to detect success or trouble in communication. A common problem in human-machine dialogue where information about a user's internal state of mind may give a clue is, for instance, the recurrent misunderstanding of the user by the system. This can be prevented if we detect the anger in the user's voice. In contrast to anger, a joyful face combined with a pleased voice may indicate a satisfied user who wants to continue with the current dialogue behavior, while a hesitant searching gesture reveals the user's uncertainty. This paper explores the possibility of recognizing a user's internal state by using facial expression classification with eigenfaces and a prosodic classifier based on artificial neural networks, combined in parallel with a discrete Hidden Markov Model (HMM) for gesture analysis. Our experiments show that all three input modalities can be used to identify a user's internal state. However, a user state is not always indicated by all three modalities at the same time; thus a fusion of the different modalities seems to be necessary. Different ways of modality fusion are discussed.
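
The abstract describes three single-modality classifiers (eigenface-based facial expression, ANN-based prosody, HMM-based gesture) whose outputs must be fused into one user-state decision. The following is a minimal sketch of one possible late-fusion scheme, assuming each classifier emits posterior scores over a shared set of user states; the state labels, weights, and averaging rule are illustrative assumptions, not taken from the paper.

import numpy as np

# Assumed user-state label set for illustration only.
USER_STATES = ["angry", "joyful", "hesitant", "neutral"]

def fuse_modalities(face_probs, prosody_probs, gesture_probs,
                    weights=(0.4, 0.4, 0.2)):
    """Weighted-average late fusion of three modality posteriors.

    Each *_probs argument is a length-4 array of class posteriors from
    the corresponding single-modality classifier (eigenface-based face,
    ANN-based prosody, HMM-based gesture). Returns the fused
    distribution and the most likely user state.
    """
    stacked = np.stack([face_probs, prosody_probs, gesture_probs])  # (3, 4)
    w = np.asarray(weights)[:, None]                                # (3, 1)
    fused = (w * stacked).sum(axis=0)
    fused /= fused.sum()          # renormalise to a probability distribution
    return fused, USER_STATES[int(fused.argmax())]

# Example: prosody strongly indicates anger while face and gesture are ambiguous.
face = np.array([0.30, 0.30, 0.20, 0.20])
prosody = np.array([0.70, 0.10, 0.10, 0.10])
gesture = np.array([0.25, 0.25, 0.25, 0.25])
print(fuse_modalities(face, prosody, gesture))

A fixed weighted average is only one of several fusion strategies; since the paper discusses different ways of combining the modalities, this sketch should be read as one plausible baseline rather than the method used by the authors.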