The Effects of Large Disturbances on On-Line Reinforcement Learning for a Walking Robot

Conference Paper (2010)
Copyright
© 2010 The Author(s)
Publication Year
2010
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reinforcement Learning is a promising paradigm for adding learning capabilities to humanoid robots. One of the difficulties of the real world is the presence of disturbances. In Reinforcement Learning, disturbances are typically dealt with stochastically. However, large and infrequent disturbances do not fit well in this framework; essentially, they are outliers and not part of the underlying (stochastic) Markov Decision Process. Therefore, they can negatively influence learning. The main sources of such disturbances for a humanoid robot are sudden changes in the dynamics (such as a sudden push), sensor noise and sampling time irregularities. We investigate the effects of these types of outliers on the on-line learning process of a simple walking robot simulation. We propose to exclude the outliers from the learning process with the aim of improving convergence and the final solution. While infrequent sensor and timing outliers had a negligible influence, infrequent pushes heavily disrupted the learning process. By excluding the outliers from the learning process, performance was restored.
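The core idea of the abstract, excluding outlier transitions from the learning update, can be illustrated with a minimal sketch. This is not the paper's actual method; the tabular Q-learning setup, the toy chain MDP, and the state-jump threshold used as the outlier criterion (`jump_threshold`) are all assumptions chosen for illustration.

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    # Standard tabular Q-learning update.
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

def learn(transitions, n_states=5, n_actions=2, jump_threshold=2):
    # Q-table for a hypothetical toy chain MDP. `jump_threshold` is an
    # assumed outlier criterion: a state change larger than this is
    # treated as a large disturbance (e.g. a sudden push) and the
    # transition is excluded from learning rather than used to update Q.
    Q = [[0.0] * n_actions for _ in range(n_states)]
    skipped = 0
    for (s, a, r, s_next) in transitions:
        if abs(s_next - s) > jump_threshold:
            skipped += 1        # outlier: do not learn from this sample
            continue
        q_update(Q, s, a, r, s_next)
    return Q, skipped

# Example: two normal transitions and one large jump (0 -> 4),
# which is skipped instead of corrupting the value estimates.
transitions = [(0, 0, 1.0, 1), (1, 1, 0.0, 2), (0, 0, 0.0, 4)]
Q, skipped = learn(transitions)
```

The design choice here mirrors the abstract's proposal: rather than letting rare, large disturbances enter the (stochastic) update as if they were part of the MDP, the learner simply discards those samples.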

Files

Schuitema.pdf
(pdf | 0.892 Mb)