Moerland, T.M. (author), Deichler, Anna (author), Baldi, S. (author), Broekens, D.J. (author), Jonker, C.M. (author)
Planning and reinforcement learning are two key approaches to sequential decision making. Multi-step approximate real-time dynamic programming, a recently successful algorithm class of which AlphaZero [Silver et al., 2018] is an example, combines both by nesting planning within a learning loop. However, the combination of planning and learning...
book chapter 2020