AD

Anna Deichler

1 records found

Think Too Fast Nor Too Slow

The Computational Trade-off Between Planning And Reinforcement Learning

Planning and reinforcement learning are two key approaches to sequential decision making. Multi-step approximate real-time dynamic programming, a recently successful algorithm class of which AlphaZero [Silver et al., 2018] is an example, combines both by nesting planning within a ...