CUED Publications database

Selection of multi-genre broadcast data for the training of automatic speech recognition systems

Lanchantin, P and Gales, MJF and Karanasou, P and Liu, X and Qian, Y and Wang, L and Woodland, PC and Zhang, C (2016) Selection of multi-genre broadcast data for the training of automatic speech recognition systems. In: UNSPECIFIED pp. 3057-3061.

Full text not available from this repository.

Abstract

Copyright © 2016 ISCA. This paper compares schemes for the selection of multi-genre broadcast data and corresponding transcriptions for speech recognition model training. Selections of the same amount of data (700 hours) from lightly supervised alignments based on the same original subtitle transcripts are compared. Data segments were selected according to a maximum phone matched error rate between the lightly supervised decoding and the original transcript. The data selected with an improved lightly supervised system yields lower word error rates (WERs). Detailed comparisons of the data selected on carefully transcribed development data show how the selected portions match the true phone error rate for each genre. From a broader perspective, it is shown that for different genres, either the original subtitles or the lightly supervised output should be used for model training and a suitable combination yields further reductions in final WER.
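
The selection criterion described in the abstract (keep a segment only if the phone-level mismatch between the lightly supervised decoding and the original transcript stays under a maximum) can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not define the phone matched error rate precisely, so the sketch assumes it is the Levenshtein distance between the two phone sequences divided by the length of the transcript phone sequence. The names `Segment`, `phone_error_rate`, `select_segments` and the threshold `max_pmer` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(transcript_phones: Sequence[str],
                     decoded_phones: Sequence[str]) -> float:
    """Assumed definition: edit distance normalised by transcript length."""
    if not transcript_phones:
        return 0.0 if not decoded_phones else float("inf")
    return edit_distance(transcript_phones, decoded_phones) / len(transcript_phones)


@dataclass
class Segment:
    """Hypothetical container for one aligned broadcast segment."""
    segment_id: str
    transcript_phones: List[str]  # phones from the original subtitle transcript
    decoded_phones: List[str]     # phones from the lightly supervised decoding


def select_segments(segments: Sequence[Segment], max_pmer: float) -> List[Segment]:
    """Keep segments whose phone-level mismatch does not exceed max_pmer."""
    return [s for s in segments
            if phone_error_rate(s.transcript_phones, s.decoded_phones) <= max_pmer]


segments = [
    Segment("a", ["k", "ae", "t"], ["k", "ae", "t"]),
    Segment("b", ["k", "ae", "t"], ["k", "ae", "p"]),
    Segment("c", ["d", "ao", "g"], ["k", "ae", "t"]),
]
selected_ids = [s.segment_id for s in select_segments(segments, max_pmer=0.4)]
```

In this toy example the threshold is fixed by hand. To compare selections of equal size, as the abstract does with 700 hours, the threshold would presumably be adjusted until the selected segments sum to the target duration.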

Item Type: Conference or Workshop Item (UNSPECIFIED)
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:00
Last Modified: 07 Sep 2017 01:42