S S, S and Nicolas, C and N, W (2010) Bayesian learning of noisy Markov decision processes. Technical Report. Cambridge University Engineering Department.
Full text not available from this repository.
This work addresses the problem of estimating the optimal value function in a Markov Decision Process from observed state-action pairs. We adopt a Bayesian approach to inference, which allows both the model to be estimated and predictions about actions to be made in a unified framework, providing a principled approach to mimicry of a controller on the basis of observed data. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution over the optimal value function. The sampler includes a parameter expansion step, which is shown to be essential for good convergence properties. As an illustration, the method is applied to learning a human controller.
|Item Type:||Monograph (Technical Report)|
|Uncontrolled Keywords:||Markov Decision Process, Bayesian learning, Markov Chain Monte Carlo, Data augmentation, Parameter expansion|
|Date Deposited:||09 Dec 2016 17:25|
|Last Modified:||09 Dec 2016 17:25|