CUED Publications database

Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis

Degottex, G and Lanchantin, P and Roebel, A and Rodet, X (2013) Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis. Speech Communication, 55. pp. 278-294. ISSN 0167-6393

Full text not available from this repository.


In current methods for voice transformation and speech synthesis, the vocal tract filter is usually assumed to be excited by a flat amplitude spectrum. In this article, we present a method using a mixed source model defined as a mixture of the Liljencrants-Fant (LF) model and Gaus sian noise. Using the LF model, the base approach used in this presented work is therefore close to a vocoder using exogenous input like ARX-based methods or the Glottal Spectral Separation (GSS) method. Such approaches are therefore dedicated to voice processing promising an improved naturalness compared to generic signal models. To estimate the Vocal Tract Filter (VTF), using spectral division like in GSS, we show that a glottal source model can be used with any envelope estimation method conversely to ARX approach where a least square AR solution is used. We therefore derive a VTF estimate which takes into account the amplitude spectra of both deterministic and random components of the glottal source. The proposed mixed source model is controlled by a small set of intuitive and independent parameters. The relevance of this voice production model is evaluated, through listening tests, in the context of resynthesis, HMM-based speech synthesis, breathiness modification and pitch transposition. © 2012 Elsevier B.V. All rights reserved.

Item Type: Article
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:30
Last Modified: 17 May 2018 07:47