Point-Based Policy Synthesis for POMDPs with Boolean and Quantitative Objectives

Y. Wang, S. Chaudhuri, and L. E. Kavraki, “Point-Based Policy Synthesis for POMDPs with Boolean and Quantitative Objectives,” IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1860–1867, Apr. 2019.

Abstract

Effectively planning robust executions under uncertainty is critical for building autonomous robots. Partially Observable Markov Decision Processes (POMDPs) provide a standard framework for modeling many robot applications under uncertainty. We study POMDPs with two kinds of objectives: (1) Boolean objectives for a correctness guarantee of accomplishing tasks and (2) quantitative objectives for optimal behaviors. For robotic domains that require both correctness and optimality, POMDPs with Boolean and quantitative objectives are natural formulations. We present a practical policy synthesis approach for POMDPs with Boolean and quantitative objectives by combining policy iteration and policy synthesis for POMDPs with only Boolean objectives. To improve efficiency, our approach produces approximate policies by performing the point-based backup on a small set of representative beliefs. Despite being approximate, our approach maintains validity (satisfying Boolean objectives) and guarantees improved policies at each iteration before termination. Moreover, the error due to approximation is bounded. We evaluate our approach in several robotic domains. The results show that our approach produces good approximate policies that guarantee task completion.
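
To illustrate the point-based backup the abstract refers to, the sketch below shows a generic PBVI-style backup over a small, fixed set of representative beliefs. This is not the authors' implementation: the paper additionally restricts backups to policies that remain valid for the Boolean objective, which is omitted here, and the tiny two-state POMDP model (T, Z, R) is hypothetical and only for demonstration.

```python
# Illustrative sketch (assumed, not from the paper): a standard point-based
# backup over representative beliefs, the core operation the approach adapts.
import numpy as np

gamma = 0.95
S, A, O = 2, 2, 2                                 # hypothetical toy model sizes
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])          # T[a, s, s'] transition probs
Z = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])          # Z[a, s', o] observation probs
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])                        # R[a, s] immediate rewards

def point_based_backup(b, Gamma):
    """One Bellman backup at belief b; returns the best new alpha-vector."""
    best_alpha, best_value = None, -np.inf
    for a in range(A):
        g_a = R[a].copy()
        for o in range(O):
            # Project every alpha-vector through (a, o); keep the one best at b.
            projections = [gamma * (T[a] * Z[a, :, o]) @ alpha for alpha in Gamma]
            g_a += max(projections, key=lambda g: g @ b)
        if g_a @ b > best_value:
            best_alpha, best_value = g_a, g_a @ b
    return best_alpha

# A small set of representative beliefs, as in point-based methods.
B = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.1, 0.9])]
Gamma = [np.zeros(S)]                             # initial value function
for _ in range(50):                               # repeat backups at each belief
    Gamma = [point_based_backup(b, Gamma) for b in B]
print([max(alpha @ b for alpha in Gamma) for b in B])   # approximate values at B
```

Because backups are performed only at the beliefs in B, the resulting value function (and policy) is approximate; the paper bounds the error introduced by this approximation while preserving validity with respect to the Boolean objective.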

Publisher: http://dx.doi.org/10.1109/LRA.2019.2898045

PDF preprint: http://kavrakilab.org/publications/wang2019point-based-policy.pdf