Tabular Reinforcement Learning Aided by Generalisation Methods


Abstract

Reinforcement learning is a machine learning paradigm that learns to solve optimisation problems by interacting with an environment. Tabular reinforcement learning methods are popular because of their relative simplicity combined with strong guarantees of finding an optimal solution. Their downside is an exploration space that grows exponentially with problem size, which obstructs learning in large tasks. Function approximation is the most commonly used alternative for dealing with this problem. It interpolates values for states that have never been encountered before by learning the values of generalising features. This generalising property makes larger problems tractable and allows for faster learning. Unfortunately, it can be difficult to specify the features required to learn a solution, and convergence cannot always be guaranteed. Generalisation at an even higher level is called transfer learning, where knowledge acquired in one or more earlier tasks is reused to aid the learning process in the next task. This thesis proposes a framework that combines tabular reinforcement learning methods with both of these generalising concepts to achieve a convergent learning process with good generalisation properties. To test the viability of the proposed method, it is used to solve discrete path planning problems. The results of these tests show that simultaneous learning with the help of function approximation in a parallel learning process can be leveraged to achieve a significant reduction in the number of steps needed in both the first and subsequent tasks.
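To make the combination described above concrete, the following is a minimal sketch, not the framework developed in the thesis: a tabular Q-learning agent on a small corridor task whose updates are mirrored by a linear function approximator learning in parallel, with the approximator's generalised estimates used as a fallback for state-action pairs the table has not yet visited. The task, the features, and the fallback rule are all illustrative assumptions.

import numpy as np

# Hypothetical 1-D corridor task: states 0..N-1, goal at the right end.
# Actions: 0 = left, 1 = right. Reward 1.0 on reaching the goal, else 0.
N_STATES, N_ACTIONS = 20, 2
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def features(s, a):
    """Simple generalising features: bias, normalised position, action flag."""
    return np.array([1.0, s / (N_STATES - 1), float(a)])

Q = np.zeros((N_STATES, N_ACTIONS))  # tabular estimate (convergent)
w = np.zeros(3)                      # linear approximator (generalising)

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

rng = np.random.default_rng(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        if rng.random() < EPS:
            a = int(rng.integers(N_ACTIONS))
        else:
            # Act greedily on the tabular values; for pairs the table has
            # not yet learned about, fall back on the approximator's
            # generalised estimate (a crude "unvisited" heuristic here).
            vals = [Q[s, b] if Q[s, b] != 0.0 else features(s, b) @ w
                    for b in range(N_ACTIONS)]
            a = int(np.argmax(vals))
        s2, r, done = step(s, a)
        target = r + GAMMA * Q[s2].max() * (not done)
        Q[s, a] += ALPHA * (target - Q[s, a])  # tabular update
        # Parallel update of the approximator from the same experience.
        w += ALPHA * (target - features(s, a) @ w) * features(s, a)
        s = s2

In this sketch the table remains the authoritative, convergent estimate, while the linear learner provides generalised guesses that can speed up exploration of unvisited states; reusing the learned weights w as the starting point for a new corridor task would play the role of transfer learning.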