Bell, PJ and Gales, MJF and Lanchantin, P and Liu, X and Long, Y and Renals, S and Swietojanski, P and Woodland, PC (2012) Transcription of multi-genre media archives using out-of-domain data. 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings. pp. 324-329.
Full text not available from this repository.
We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features. © 2012 IEEE.
|Uncontrolled Keywords:||cross-domain adaptation; media archives; speech recognition; tandem|
|Divisions:||Div F > Machine Intelligence|
|Depositing User:||Cron Job|
|Date Deposited:||07 Mar 2014 12:08|
|Last Modified:||08 Dec 2014 02:27|