CUED Publications database

I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models

Karanasou, P and Wu, C and Gales, M and Woodland, PC (2017) I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25. pp. 818-828. ISSN 2329-9290

Full text not available from this repository.

Abstract

© 2017 IEEE. The adaptation of deep neural network (DNN) acoustic models has attracted considerable interest in recent years, as these models have become the state of the art in automatic speech recognition. This work focuses on approaches that allow for rapid and robust adaptation of such models. First, i-vectors are added to the DNN input as speaker-informed features. An informative prior is introduced into i-vector estimation to improve robustness to limited adaptation data. I-vectors are then combined with a structured adaptive DNN, the multibasis adaptive neural network (MBANN), and the complementarity of these adaptation techniques is investigated. Moreover, i-vectors are used to predict the MBANN transforms, avoiding the initial decoding pass and alignment. These approaches are evaluated on a U.S. English Broadcast News (BN) transcription task with two distinct sets of test data. The first, drawn from the BN task and BN-style YouTube videos, is acoustically matched to the training data, while the second is drawn from acoustically mismatched YouTube videos of diverse context. The performance gains from these schemes are found to be sensitive to the level of mismatch between training and test sets. The MBANN system combined with i-vector input achieves the best performance on the BN test sets. The i-vector-based predictive MBANN scheme proves more robust to acoustically mismatched conditions and outperforms the other adaptation schemes in such scenarios.

Item Type: Article
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:00
Last Modified: 08 Aug 2017 01:53