Exploration When Everything Looks New
Effect of the Local Uncertainty Source on Exploration
V. Vadocz (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Yaniv Oren – Mentor (TU Delft - Sequential Decision Making)
Matthijs Spaan – Mentor (TU Delft - Sequential Decision Making)
Neil Yorke-Smith – Graduation committee member (TU Delft - Algorithmics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Agents improve by interacting with an environment and by planning. By leveraging information about what they do not know, they can learn better and faster, at least in environments that reward exploration. To do so, they estimate the uncertainty in their predictions. This uncertainty can be estimated in several ways, and in this work we examine how that choice affects the exploration and playing strength of agents playing board games. We compare a source of uncertainty that perfectly tracks what the agent has seen with a source that generalizes. We also describe the challenges of tuning uncertainty estimators and show what must be considered when exploration is not all you need.