CUED Publications database

Improving speech transcription for Mandarin-English translation

Tomalin, M and Gales, MJF and Liu, XA and Sim, KC and Sinha, R and Wang, L and Woodland, PC and Yu, K (2007) Improving speech transcription for Mandarin-English translation. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 4. IV97-IV100. ISSN 1520-6149

Full text not available from this repository.

Abstract

This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute. © 2007 IEEE.
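The abstract reports its STT improvements as absolute CER differences. As a minimal sketch of what that means, assuming the usual definition of CER as character-level edit distance divided by reference length (this record does not state which scoring tool the authors used), the metric and an absolute gain can be computed as follows. The function names and the example strings are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Substitutions, insertions and deletions each cost 1.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character Error Rate in percent (assumed standard definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def absolute_gain(baseline_pct, improved_pct):
    """Absolute improvement: the plain difference of two percentages."""
    return baseline_pct - improved_pct


# Hypothetical example: one substituted character in a six-character reference.
reference = "今天天气很好"
hypothesis = "今天天汽很好"
example_cer = cer(reference, hypothesis)
```

An "absolute" gain is the simple difference between two error rates in percentage points, as opposed to a relative gain, which would divide that difference by the baseline rate.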

Item Type: Article
Uncontrolled Keywords: Machine translation; Sentence boundary detection; Speech recognition
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 07 Mar 2014 12:15
Last Modified: 26 Nov 2014 19:05
DOI: 10.1109/ICASSP.2007.367172