Speech Understanding and Speech Translation in Various Domains

Title Speech Understanding and Speech Translation in Various Domains by Maximum a-posteriori Semantic Decoding
Authors Johannes Müller, Holger Stahl
Abstract This paper describes a domain-limited system for speech understanding and speech translation. An integrated semantic decoder converts the preprocessed speech signal directly into its semantic representation by maximum a-posteriori classification. By combining probabilistic knowledge on the acoustic, phonetic, syntactic, and semantic levels, the semantic decoder extracts the most probable meaning of the utterance. No separate speech recognition stage is needed, because the decoder integrates the Viterbi algorithm (which calculates acoustic probabilities using hidden Markov models) with a probabilistic chart parser (which calculates semantic and syntactic probabilities using special models). The semantic structure is introduced as the representation of an utterance's meaning. It can serve as an intermediate level for a subsequent intention decoder (within a speech understanding system that controls a running application by spoken input) or as an interlingua level for a subsequent language production unit (within an automatic speech translation system that creates spoken output in another language). Following these principles and using the respective algorithms, speech understanding and speech translation front-ends were successfully realized for the domains 'graphic editor', 'service robot', 'medical image visualization', and 'scheduling dialogues'.
Reference Proceedings "International Symposium on Engineering of Intelligent Systems" EIS 98 (La Laguna, Spain, 1998), vol. 2 "Neural Networks", pp. 256-267
Year 1998
Language English

Created on 14.02.2007, last modified on 09.01.2008