Reinforcement learning of potential fields to achieve limit-cycle walking

Conference Paper (2016)

Author(s)

D.S. Feirstein (Student TU Delft)

Ivan Koryakovskiy (TU Delft - Biomechatronics & Human-Machine Control)

Jens Kober (TU Delft - OLD Intelligent Control & Robotics)

Heike Vallery (TU Delft - OLD Biorobotics)

Research Group

Biomechatronics & Human-Machine Control

DOI related publication

https://doi.org/10.1016/j.ifacol.2016.07.994

Machine learning, Limit cycles, Walking, Robot control, Energy Control

To reference this document use:

https://resolver.tudelft.nl/uuid:b4693303-4c1a-40f6-b20b-f19ba845701f

Publication Year

2016

Language

English

Pages (from-to)

113-118

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reinforcement learning is a powerful tool to derive controllers for systems where no models are available. Policy search algorithms in particular are suitable for complex systems, as they keep learning time manageable and account for continuous state and action spaces. However, these algorithms demand more insight into the system to choose a suitable controller parameterization. This paper investigates a type of policy parameterization for impedance control that allows energy input to be implicitly bounded: potential fields. This work presents a methodology for generating a potential field-constrained impedance controller via approximation of example trajectories, and subsequently improving the control policy using reinforcement learning. The potential field-constrained approximation is used as a policy parameterization for policy search reinforcement learning and is compared to its unconstrained counterpart. Simulations on a simple biped walking model show that, for both parameterizations, the learned controllers are able to surpass the potential field of gravity by generating a stable limit-cycle gait on flat ground. The potential field-constrained controller provides safety with a known energy bound while performing as well as the unconstrained policy.
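
To illustrate why a potential-field parameterization implies a known energy bound, the following is a minimal one-dimensional sketch. It is not the paper's implementation: the Gaussian radial-basis form of the potential, the names `CENTERS`, `WIDTH`, `WEIGHTS`, and every numeric value are assumptions chosen only for illustration. The point it shows is that when the control torque is the negative gradient of a bounded potential, the work it does along any path equals the drop in potential, which can never exceed the spread of the potential.

```python
import math

# Hypothetical 1-D example: all names and numbers here are illustrative
# assumptions, not values taken from the paper.
CENTERS = [-1.0, 0.0, 1.0]   # RBF centers over a joint angle q
WIDTH = 0.5                  # shared RBF width
WEIGHTS = [0.8, -0.3, 0.5]   # parameters a policy search method would tune


def potential(q, weights=WEIGHTS):
    """Potential U(q) as a weighted sum of Gaussian radial basis functions."""
    return sum(w * math.exp(-((q - c) ** 2) / (2 * WIDTH ** 2))
               for w, c in zip(weights, CENTERS))


def torque(q, weights=WEIGHTS):
    """Conservative control torque tau = -dU/dq, using the analytic gradient."""
    return sum(w * (q - c) / WIDTH ** 2
               * math.exp(-((q - c) ** 2) / (2 * WIDTH ** 2))
               for w, c in zip(weights, CENTERS))


def work_along_path(q_start, q_end, steps=20000):
    """Numerically integrate tau dq from q_start to q_end (midpoint rule)."""
    dq = (q_end - q_start) / steps
    return sum(torque(q_start + (i + 0.5) * dq) * dq for i in range(steps))


def energy_bound(weights=WEIGHTS):
    """Bound on the energy the field can inject between any two states.

    Each Gaussian lies in (0, 1], so U stays between the sum of the negative
    weights and the sum of the positive weights; the spread of U is therefore
    at most the sum of the absolute weights.
    """
    return sum(abs(w) for w in weights)


if __name__ == "__main__":
    work = work_along_path(-1.5, 1.2)
    print("work done by the field:", work)
    print("potential drop        :", potential(-1.5) - potential(1.2))
    print("energy bound          :", energy_bound())
```

An unconstrained policy that outputs torque directly as a function of state has no such guarantee, since nothing forces its torque field to be the gradient of a bounded scalar function.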

Files

IFAC2016_Reinforcement_Feirste... (pdf)

(pdf | 2.91 Mb)