Diehl, F and Gales, MJF and Liu, X and Tomalin, M and Woodland, PC (2011) Word boundary modelling and full covariance Gaussians for Arabic Speech-to-Text systems. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. pp. 777-780.
Full text not available from this repository.
This paper describes recent improvements to the Cambridge Arabic Large Vocabulary Continuous Speech Recognition (LVCSR) Speech-to-Text (STT) system. It is shown that word-boundary context markers provide a powerful method for enriching graphemic systems with implicit phonetic information, improving their modelling capability. In addition, a robust technique for full covariance Gaussian modelling in the Minimum Phone Error (MPE) training framework is introduced. This technique reduces full covariance training to a diagonal covariance training problem, thereby solving the associated robustness problems. The full system results show that the combined use of these and other techniques within a multi-branch combination framework reduces the Word Error Rate (WER) of the complete system by up to 5.9% relative. Copyright © 2011 ISCA.
|Divisions:||Div F > Machine Intelligence|
|Depositing User:||Cron Job|
|Date Deposited:||18 May 2016 19:12|
|Last Modified:||24 Aug 2016 04:43|