CUED Publications database

Interaction Quality: Assessing the quality of ongoing spoken dialog interaction by experts - And how it relates to user satisfaction

Schmitt, A and Ultes, S (2015) Interaction Quality: Assessing the quality of ongoing spoken dialog interaction by experts - And how it relates to user satisfaction. Speech Communication, 74. pp. 12-36. ISSN 0167-6393


© 2015 Elsevier B.V. All rights reserved.

This study presents a novel expert-based approach to assessing the quality of ongoing Spoken Dialog System (SDS) interactions, which we call "Interaction Quality" (IQ). It is an objective measure that relies on statistical classification with Support Vector Machines (SVMs). We compare objective expert IQ annotations of ongoing SDS interactions with subjective User Satisfaction (US) ratings and show that IQ and US correlate (ρ=.66). Expert annotations thus mirror the subjective user impression to a great extent while being much easier to obtain. The IQ score that quantifies the quality of the interaction is the median of the exchange-level annotations of several experts. US is tracked in a study with 38 users interacting with an SDS. A large, comprehensive set of domain-independent, automatic interaction parameters is introduced to quantify the interaction at arbitrary dialog exchanges. Furthermore, a manually annotated negative emotion feature is added to the parameter set in order to evaluate the contribution of emotions to the classification of IQ and US. For evaluation we use the CMU Let's Go bus information system. The model yields a correlation of ρ=.80 when classifying IQ scores annotated in field data from the CMU system. Furthermore, the model achieves ρ=.74 for predicting US on lab data, and ρ=.89 for IQ on lab data. The presented approach outperforms related studies in the field. Only a marginal contribution of the emotion feature to the performance can be observed, implying that US is not influenced by visible emotions. We analyze causalities and correlations between the interaction parameters and the target variables US/IQ and identify relevant predictors. With the presented paradigm, critical dialogs can be found; once deployed as an online monitoring technique, this paradigm could render SDSs more user-friendly and improve user acceptance.
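The two computations the abstract relies on, taking the median of several expert ratings per exchange and correlating the result with user ratings, can be illustrated with a minimal sketch. It assumes that ρ denotes Spearman's rank correlation, which the abstract does not state explicitly, and that ratings lie on a 1 to 5 scale. All function names and all rating values below are hypothetical and are not taken from the paper or its data.

```python
from statistics import median


def iq_labels(ratings_per_exchange):
    """Return one IQ label per exchange: the median of the expert ratings."""
    return [median(ratings) for ratings in ratings_per_exchange]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical ratings: three experts rate each of five exchanges (1-5).
    expert_ratings = [(5, 5, 4), (4, 5, 4), (3, 4, 3), (2, 3, 3), (1, 2, 2)]
    # Hypothetical user satisfaction ratings for the same five exchanges.
    user_ratings = [5, 4, 4, 2, 1]

    iq = iq_labels(expert_ratings)
    print("IQ labels:", iq)
    print("Spearman rho between IQ and US:", round(spearman_rho(iq, user_ratings), 3))
```

With an odd number of raters the median is always one of the original scale values; with an even number, `statistics.median` averages the two middle ratings, so a real pipeline would need an explicit tie-breaking rule.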

Item Type: Article
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:05
Last Modified: 22 May 2018 08:05