CUED Publications database

Speech intonation for TTS: Study on evaluation methodology

Latorre, J and Yanagisawa, K and Wan, V and Kolluru, BK and Gales, MJF (2014) Speech intonation for TTS: Study on evaluation methodology. In: UNSPECIFIED pp. 2957-2961.

Full text not available from this repository.


Copyright © 2014 ISCA. Intonation models are usually evaluated with non-referenced subjective tests (pair comparison or MOS), in which subjects rate the quality of samples or compare them without any explicit reference. These tests are usually conducted on isolated sentences. However, a single sentence with no contextual information admits multiple valid intonations, and a subject's preference among these intonation patterns may be highly personal. This paper investigates how far this ambiguity about the appropriate intonation pattern affects the assessment of prosody in speech synthesis systems. To examine the problem, the variance of the F0 pattern of several vocoded sentences was modified, and subjects were asked to compare multiple versions with different levels of modification in terms of preference/quality. They were then presented with the reference that defines the original intonation and asked how similar each version was to it. The results show that subjects can identify the samples with no F0 variance modification when given a reference, but they do not always prefer them. Thus, non-referenced tests with no context, though they may help to analyse user acceptability, may not be appropriate for measuring the performance of intonation models.
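
The abstract does not say how the F0 variance was modified, so the following is only an illustrative sketch of one common way to do it, not the authors' procedure. It assumes the contour is a list of per-frame F0 values in Hz, that unvoiced frames are marked with 0.0, and that the scaling is applied in the log-F0 domain around the utterance mean. The function name `scale_f0_variance` and the parameter `factor` are hypothetical.

```python
import math
from statistics import mean


def scale_f0_variance(f0_hz, factor):
    """Return a copy of an F0 contour whose log-F0 variance is scaled by `factor`.

    Assumptions (not taken from the paper): frames with F0 <= 0 are unvoiced
    and are returned as 0.0; scaling is done on log-F0 around the mean of the
    voiced frames, so the mean log-F0 is preserved.
    """
    if factor < 0:
        raise ValueError("factor must be non-negative")

    voiced_log_f0 = [math.log(f) for f in f0_hz if f > 0]
    if not voiced_log_f0:
        # Nothing voiced, so there is no variance to modify.
        return list(f0_hz)

    mu = mean(voiced_log_f0)
    # Scaling deviations by sqrt(factor) scales the variance by factor.
    gain = math.sqrt(factor)

    return [
        math.exp(mu + gain * (math.log(f) - mu)) if f > 0 else 0.0
        for f in f0_hz
    ]
```

With `factor` equal to 1.0 the contour is left unchanged up to rounding, values below 1.0 flatten the intonation, and values above 1.0 exaggerate it. Different levels of modification, as used in a listening test of this kind, would correspond to different values of `factor`.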

Item Type: Conference or Workshop Item (UNSPECIFIED)
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:32
Last Modified: 19 Jul 2018 07:55