CUED Publications database

A Pulse Model in Log-domain for a Uniform Synthesizer

Degottex, G and Lanchantin, P and Gales, M (2016) A Pulse Model in Log-domain for a Uniform Synthesizer. In: 9th ISCA Speech Synthesis Workshop, 2016-9-13 to 2016-9-15, Sunnyvale, CA, USA, pp. 230-236.

Full text not available from this repository.

Abstract

The quality of the vocoder plays a crucial role in the performance of parametric speech synthesis systems. To improve vocoder quality, it is necessary to reconstruct as many of the perceived components of the speech signal as possible. In this paper, we first show that the noise component is currently not accurately modelled in the widely used STRAIGHT vocoder, thus limiting the voice range that can be covered and also limiting the overall quality. To motivate a new, alternative approach to this issue, we present a new synthesizer, which uses a uniform representation for voiced and unvoiced segments. This synthesizer also has the advantage of using a simple signal model compared to other approaches, thus offering a convenient and controlled alternative for future developments. Experiments analysing the synthesis quality of the noise component show improved speech reconstruction using the suggested synthesizer compared to STRAIGHT. Additionally, an analysis/resynthesis experiment shows that the suggested synthesizer solves some of the issues of another uniform vocoder, Harmonic Model plus Phase Distortion (HMPD). In text-to-speech synthesis, it outperforms HMPD and exhibits quality similar to, or only slightly worse than, that of STRAIGHT, which is encouraging for a new vocoding approach.

Item Type: Conference or Workshop Item (UNSPECIFIED)
Uncontrolled Keywords: parametric speech synthesis, vocoder, pulse model
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 20:01
Last Modified: 27 Jul 2017 05:26