Reinforcement Learning (RL) is a learning paradigm in which an agent learns by interacting with its environment. In practice, an RL agent needs to perform many actions to sample rewards and state transitions from its environment. Recent advances in using deep neural networks as function approximators have reduced the sample complexity in very high-dimensional environments. Another way to reduce the sample complexity of reinforcement learning agents is to shape the learning process into a curriculum. Flexible curriculum learning is a form of shaping in which the action and/or state dimensionality of the agent is gradually increased, so that the complexity of the problem grows incrementally. In this research, a deep reinforcement learning architecture, Deep Deterministic Policy Gradients (DDPG), is modified to learn in this flexible way. The flexible DDPG algorithm is first evaluated on a problem where no coordination is required, to test its basic functionality, and then on a problem where coordination is required. Results show that the flexible DDPG algorithm achieves a sample complexity comparable to that of a non-flexible approach.
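To make the core idea concrete, the sketch below illustrates one way the "flexible" mechanism could look in code: an actor network that starts with a reduced action dimensionality and later grows its output layer while preserving the weights learned so far. This is a minimal illustration, not the authors' implementation; the network sizes, the `FlexibleActor` class, and the small-initialization rule for new action dimensions are all assumptions made for this example.

```python
# Minimal sketch of the flexible-dimensionality idea behind flexible DDPG
# (illustrative assumption, not the paper's implementation).
import torch
import torch.nn as nn


class FlexibleActor(nn.Module):
    """DDPG-style actor whose action dimensionality can be increased."""

    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.hidden = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
        )
        self.out = nn.Linear(hidden_dim, action_dim)

    def forward(self, state):
        # Actions squashed to [-1, 1], as is common in DDPG actors.
        return torch.tanh(self.out(self.hidden(state)))

    def grow_action_dim(self, new_action_dim):
        """Expand the output layer: keep old weights, add new rows."""
        old = self.out
        assert new_action_dim >= old.out_features
        new = nn.Linear(old.in_features, new_action_dim)
        with torch.no_grad():
            # Copy the weights already learned for the old dimensions.
            new.weight[: old.out_features] = old.weight
            new.bias[: old.out_features] = old.bias
            # Assumed heuristic: start new dimensions near zero so they
            # barely perturb the behaviour learned so far.
            new.weight[old.out_features :].mul_(0.01)
            new.bias[old.out_features :].zero_()
        self.out = new


# Usage: train with 2 action dimensions first, then grow to 4.
actor = FlexibleActor(state_dim=8, action_dim=2)
state = torch.randn(1, 8)
print(actor(state).shape)  # torch.Size([1, 2])
actor.grow_action_dim(4)
print(actor(state).shape)  # torch.Size([1, 4])
```

Under this reading, the curriculum proceeds by training on the low-dimensional problem until some criterion is met, then calling `grow_action_dim` and continuing training on the fuller problem, which is what lets sample experience from the easy stage carry over to the harder one.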