Poupart, P and Kim, KE and Kim, D (2011) Closing the gap: Improved bounds on optimal POMDP solutions. ICAPS 2011 - Proceedings of the 21st International Conference on Automated Planning and Scheduling. pp. 194-201.
Full text not available from this repository.
POMDP algorithms have made significant progress in recent years by allowing practitioners to find good solutions to increasingly large problems. Most approaches (including point-based and policy iteration techniques) operate by refining a lower bound on the optimal value function. Several approaches (e.g., HSVI2, SARSOP, grid-based approaches and online forward search) also refine an upper bound. However, approximating the optimal value function by an upper bound is computationally expensive, and therefore tightness is often sacrificed to improve efficiency (e.g., sawtooth approximation). In this paper, we describe a new approach to efficiently compute tighter bounds by i) conducting a prioritized breadth-first search over the reachable beliefs, ii) propagating upper bound improvements with an augmented POMDP, and iii) using exact linear programming (instead of the sawtooth approximation) for upper bound interpolation. As a result, we can represent the bounds more compactly and significantly reduce the gap between upper and lower bounds on several benchmark problems. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
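To illustrate why the interpolation method affects the tightness of the upper bound, the sketch below compares the sawtooth approximation with exact convex-hull interpolation on a toy two-state problem. It is not the authors' implementation. The sawtooth formula follows its usual statement in the POMDP literature. The function names (`sawtooth`, `hull_2state`), the corner values, and the belief-value pairs are all made up for illustration. The exact interpolation is computed by brute force over pairs of points, which matches the linear program only because the belief space is one-dimensional here; a general problem would need an LP solver.

```python
from itertools import combinations

def corner_interp(b, corners):
    """Linear interpolation of the corner (single-state) upper-bound values."""
    return sum(p * u for p, u in zip(b, corners))

def sawtooth(b, corners, points):
    """Sawtooth upper bound: improve the corner interpolation with ONE stored point at a time."""
    base = corner_interp(b, corners)
    best = base
    for bi, vi in points:
        # Largest weight that can be placed on bi while staying inside the simplex.
        ratio = min(b[s] / bi[s] for s in range(len(b)) if bi[s] > 0)
        best = min(best, base + ratio * (vi - corner_interp(bi, corners)))
    return best

def hull_2state(b, corners, points):
    """Exact convex-hull interpolation, valid for two states only (assumption of this sketch).

    With two states an optimal LP solution uses at most two points,
    so checking every bracketing pair is sufficient.
    """
    pts = [(1.0, corners[0]), (0.0, corners[1])] + [(bi[0], vi) for bi, vi in points]
    p = b[0]
    best = min((v for q, v in pts if q == p), default=float("inf"))
    for a, c in combinations(pts, 2):
        lo, hi = sorted([a, c])
        if lo[0] < hi[0] and lo[0] <= p <= hi[0]:
            w = (p - lo[0]) / (hi[0] - lo[0])
            best = min(best, (1 - w) * lo[1] + w * hi[1])
    return best

# Hypothetical data: corner values and two stored (belief, upper-bound value) pairs.
corners = [10.0, 10.0]
points = [([0.25, 0.75], 6.0), ([0.75, 0.25], 6.0)]
query = [0.5, 0.5]

print("sawtooth:", sawtooth(query, corners, points))
print("exact   :", hull_2state(query, corners, points))
```

On this toy data the sawtooth bound at the query belief is about 7.33, while the exact interpolation gives 6.0, because it can combine both stored points instead of only one of them with the corners.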
|Divisions:||Div F > Machine Intelligence|
|Date Deposited:||09 Dec 2016 17:11|
|Last Modified:||10 Dec 2016 21:53|