Lefèvre, F and Mairesse, F and Young, S (2010) Cross-Lingual spoken language understanding from unaligned data using discriminative classification models and machine translation. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. pp. 78-81.
Full text not available from this repository.
This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language given a large dataset of semantically annotated utterances in some other source language. The aim is to reduce the cost associated with porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models, both of which are trained from unaligned data, to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type; however, individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.
|Uncontrolled Keywords:||Bootstrapping, Portability, Spoken dialogue system, Spoken language understanding|
|Divisions:||Div F > Machine Intelligence|
|Depositing User:||Cron Job|
|Date Deposited:||07 Mar 2014 12:03|
|Last Modified:||27 Nov 2014 19:19|