CUED Publications database

Policy optimisation of POMDP-based dialogue systems without state space compression

Gašić, M and Henderson, M and Thomson, B and Tsiakoulis, P and Young, S (2012) Policy optimisation of POMDP-based dialogue systems without state space compression. 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings. pp. 31-36.


Abstract

The partially observable Markov decision process (POMDP) has been proposed as a dialogue model that enables automatic improvement of the dialogue policy and robustness to speech understanding errors. It requires, however, a large number of dialogues to train the dialogue policy. Gaussian processes (GPs) have recently been applied to POMDP dialogue management optimisation, showing an ability to substantially increase the speed of learning. Here, we investigate this further using the Bayesian Update of Dialogue State dialogue manager. We show that it is possible to apply Gaussian processes directly to the belief state, removing the need for a parametric policy representation. In addition, the resulting policy learns significantly faster while maintaining operational performance. © 2012 IEEE.

Item Type: Article
Uncontrolled Keywords: Gaussian process, POMDP, statistical dialogue modelling
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 07 Mar 2014 11:47
Last Modified: 16 Dec 2014 19:06
DOI: 10.1109/SLT.2012.6424165