Y. Oren | TU Delft Repository

Epistemic Monte Carlo Tree Search

Conference paper (2025) - Y. Oren (author), Viliam Vadocz (author), M.T.J. Spaan (author), J.W. Böhmer (author)

The AlphaZero/MuZero (A/MZ) family of algorithms has achieved remarkable success across various challenging domains by integrating Monte Carlo Tree Search (MCTS) with learned models. Learned models introduce epistemic uncertainty, which is caused by learning from limited data and ...

Value Improved Actor Critic Algorithms

Preprint (2024) - Y. Oren (author), M.A. Zanger (author), P.R. van der Vaart (author), M.T.J. Spaan (author), J.W. Böhmer (author)

Many modern reinforcement learning algorithms build on the actor-critic (AC) framework: iterative improvement of a policy (the actor) using policy improvement operators and iterative approximation of the policy's value (the critic). In contrast, the popular value-based algorithm ...

E-MCTS: Deep Exploration in Model-Based Reinforcement Learning by Planning with Epistemic Uncertainty

Preprint (2023) - Y. Oren (author), M.T.J. Spaan (author), J.W. Böhmer (author)

One of the most well-studied and highly performing planning approaches used in Model-Based Reinforcement Learning (MBRL) is Monte-Carlo Tree Search (MCTS). Key challenges of MCTS-based MBRL methods remain dedicated deep exploration and reliability in the face of the unknown, and ...