CUED Publications database

Adaptation of deep neural network acoustic models using factorised i-vectors

Karanasou, P and Wang, Y and Gales, MJF and Woodland, PC (2014) Adaptation of deep neural network acoustic models using factorised i-vectors. In: UNSPECIFIED pp. 2180-2184..

Full text not available from this repository.


Copyright © 2014 ISCA. The use of deep neural networks (DNNs) in a hybrid configuration is becoming increasingly popular and successful for speech recognition. One issue with these systems is how to efficiently adapt them to reflect an individual speaker or noise condition. Recently speaker i-vectors have been successfully used as an additional input feature for unsupervised speaker adaptation. In this work the use of i-vectors for adaptation is extended to incorporate acoustic factorisation. In particular, separate i-vectors are computed to represent speaker and acoustic environment. By ensuring "orthogonality" between the individual factor representations it is possible to represent a wide range of speaker and environment pairs by simply combining i-vectors from a particular speaker and a particular environment. In this paper the i-vectors are viewed as the weights of a cluster adaptive training (CAT) system, where the underlying models are GMMs rather than HMMs. This allows the factorisation approaches developed for CAT to be directly applied. Initial experiments were conducted on a noise distorted version of the WSJ corpus. Compared to standard speaker-based i-vector adaptation, factorised i-vectors showed performance gains.

Item Type: Conference or Workshop Item (UNSPECIFIED)
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:01
Last Modified: 07 Jun 2018 02:10