Scaling Long-Horizon Online POMDP Planning via Rapid State Space Sampling

Y. Liang, E. Kim, W. Thomason, Z. Kingston, H. Kurniawati, and L. E. Kavraki, “Scaling Long-Horizon Online POMDP Planning via Rapid State Space Sampling,” in Robotics Research, 2025.

Abstract

Partially Observable Markov Decision Processes (POMDPs) are a general and principled framework for motion planning under uncertainty. Despite tremendous improvement in the scalability of POMDP solvers, long-horizon POMDPs (e.g., steps) remain difficult to solve. This paper proposes a new approximate online POMDP solver, called Reference-Based Online POMDP Planning via Rapid State Space Sampling (ROP-RaS3). ROP-RaS3 uses novel, extremely fast sampling-based motion planning techniques to sample the state space and generate a diverse set of macro actions online. These macro actions are then used to bias belief-space sampling and infer high-quality policies without requiring exhaustive enumeration of the action space – a fundamental constraint for modern online POMDP solvers. ROP-RaS3 is evaluated on various long-horizon POMDPs, including a problem with a planning horizon of more than 100 steps and a problem with a 15-dimensional state space that requires more than 20 lookahead steps. In all of these problems, ROP-RaS3 substantially outperforms other state-of-the-art methods, by up to several fold.

PDF preprint: http://kavrakilab.org/publications/liang2024-scaling.pdf