Point-Based Value Iteration for Finite-Horizon POMDPs

None, None; None, None

Point-Based Value Iteration for Finite-Horizon POMDPs

Journal Article (2019)

Author(s)

Erwin Walraven (TU Delft - Algorithmics)

Matthijs T.J. Spaan (TU Delft - Algorithmics)

URL related publication

https://www.jair.org/index.php/jair/article/view/11324 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:18992e92-eb06-445b-80cc-5cd1fc28afef

More Info

expand_more

Publication Year

2019

Language

English

Volume number

65

Pages (from-to)

307-341

Downloads counter

149

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Partially Observable Markov Decision Processes (POMDPs) are a popular formalism for sequential decision making in partially observable environments. Since solving POMDPs to optimality is a difficult task, point-based value iteration methods are widely used. These methods compute an approximate POMDP solution, and in some cases they even provide guarantees on the solution quality, but these algorithms have been designed for problems with an infinite planning horizon. In this paper we discuss why state-of-the-art point-based algorithms cannot be easily applied to finite-horizon problems that do not include discounting. Subsequently, we present a general point-based value iteration algorithm for finite-horizon problems which provides solutions with guarantees on solution quality. Furthermore, we introduce two heuristics to reduce the number of belief points considered during execution, which lowers the computational requirements. In experiments we demonstrate that the algorithm is an effective method for solving finite-horizon POMDPs.

Files

Jair2019.pdf

(pdf | 1.05 Mb)

License info not available