Consolidated Deep Actor Critic Networks

Master Thesis (2015)
Contributor(s)

M. Loog – Mentor

J. Kober – Mentor

Copyright
© 2015 Van der Laan, T.A.
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The works [Mnih et al. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.] and [Mnih et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.] have demonstrated the power of combining deep neural networks with Watkins' Q-learning. They introduce deep Q-networks (DQN) that learn to associate high-dimensional inputs with Q-values in order to produce discrete actions, allowing the system to learn complex strategies and play Atari games such as Breakout and Space Invaders. Although powerful, the system is limited to discrete actions. If we wish to control more complex systems such as robots, we need the ability to output multidimensional continuous actions. In this thesis we investigate how to combine deep neural networks with actor-critic models, which can output multidimensional continuous actions. Following the DQN naming convention, we name this class of systems deep actor-critic networks (DACN). We derive and experiment with four methods to update the actor. We then consolidate the actor and critic networks into one unified network, which we name the consolidated deep actor-critic network (C-DACN); we hypothesize that this consolidation might lead to faster convergence. We test the system in two environments: Acrobot (an underactuated double pendulum) and Bounce (a continuous-action look-alike of Atari Breakout).
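The abstract does not spell out the network layout, but a minimal sketch of the consolidation idea, in which one shared torso feeds both an actor head producing continuous actions and a critic head estimating a Q-value, might look as follows. All names and sizes here (ConsolidatedActorCritic, torso, the hidden width) are illustrative assumptions rather than the thesis's actual implementation, and PyTorch is used purely for convenience; the original 2015 work predates it.

    import torch
    import torch.nn as nn

    class ConsolidatedActorCritic(nn.Module):
        # Hypothetical sketch: a single network with a shared torso,
        # an actor head for multidimensional continuous actions, and
        # a critic head estimating Q(s, a). The thesis's exact layout
        # and actor update rules are in the full text; this only
        # illustrates the consolidation idea.
        def __init__(self, obs_dim, action_dim, hidden=64):
            super().__init__()
            # Shared feature extractor (the "consolidated" part).
            self.torso = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            # Actor head: continuous action vector squashed to [-1, 1].
            self.actor = nn.Sequential(nn.Linear(hidden, action_dim), nn.Tanh())
            # Critic head: Q-value for the shared features and an action.
            self.critic = nn.Sequential(
                nn.Linear(hidden + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, obs):
            features = self.torso(obs)
            action = self.actor(features)
            q_value = self.critic(torch.cat([features, action], dim=-1))
            return action, q_value

    # Toy usage: a 4-dimensional Acrobot-like state, 1-dimensional torque action.
    net = ConsolidatedActorCritic(obs_dim=4, action_dim=1)
    action, q_value = net(torch.randn(1, 4))
    print(action.shape, q_value.shape)  # torch.Size([1, 1]) torch.Size([1, 1])

In a layout like this, the torso's weights receive gradients from both the actor and the critic objectives, which is the mechanism behind the abstract's hypothesis that consolidation could speed up convergence.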
