Flego, F and Gales, MJF (2012) Factor analysis based VTS discriminative adaptive training. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. pp. 4669-4672. ISSN 1520-6149Full text not available from this repository.
Vector Taylor Series (VTS) model based compensation is a powerful approach for noise robust speech recognition. An important extension to this approach is VTS adaptive training (VAT), which allows canonical models to be estimated on diverse noise-degraded training data. These canonical model can be estimated using EM-based approaches, allowing simple extensions to discriminative VAT (DVAT). However to ensure a diagonal corrupted speech covariance matrix the Jacobian (loading matrix) relating the noise and clean speech is diagonalised. In this work an approach for yielding optimal diagonal loading matrices based on minimising the expected KL-divergence between the diagonal loading matrix and "correct" distributions is proposed. The performance of DVAT using the standard and optimal diagonalisation was evaluated on both in-car collected data and the Aurora4 task. © 2012 IEEE.
|Divisions:||Div F > Machine Intelligence|
|Depositing User:||Cron Job|
|Date Deposited:||09 Dec 2016 17:28|
|Last Modified:||30 Apr 2017 02:07|