CUED Publications database

Gaussian processes for fast policy optimisation of POMDP-based dialogue managers

Gasic, M and Jurčíček, F and Keizer, S and Mairesse, F and Thomson, B and Yu, K and Young, S (2010) Gaussian processes for fast policy optimisation of POMDP-based dialogue managers. Proceedings of the SIGDIAL 2010 Conference: 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue. pp. 201-204.

Full text not available from this repository.

Abstract

Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is to maintain tractability, so the use of approximation is inevitable. We propose applying Gaussian processes in reinforcement learning of optimal POMDP dialogue policies, in order (1) to make the learning process faster and (2) to obtain an estimate of the uncertainty of the approximation. We first demonstrate the idea on a simple voice mail dialogue task and then apply this method to a real-world tourist information dialogue task. © 2010 Association for Computational Linguistics.

Item Type: Article
Subjects: UNSPECIFIED
Divisions: Div F > Machine Intelligence
Depositing User: Cron Job
Date Deposited: 07 Mar 2014 12:10
Last Modified: 16 Dec 2014 19:06