Model-plant mismatch compensation using reinforcement learning

Journal Article (2018)
Author(s)

I. Koryakovskiy (TU Delft - Biomechatronics & Human-Machine Control)

M. Kudruss (University of Heidelberg)

Heike Vallery (TU Delft - Biomechatronics & Human-Machine Control)

Robert Babuska (TU Delft - Learning & Autonomous Control)

W. Caarls (Pontifical Catholic University of Rio de Janeiro)

Research Group
Biomechatronics & Human-Machine Control
Copyright
© 2018 I. Koryakovskiy, M. Kudruss, H. Vallery, R. Babuska, W. Caarls
DOI (related publication)
https://doi.org/10.1109/LRA.2018.2800106
Publication Year
2018
Language
English
Issue number
3
Volume number
3
Pages (from-to)
2471 - 2477
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning-based approaches are suitable for controlling systems with unknown dynamics. However, learning from scratch requires many trials with exploratory actions before a good control policy is discovered, and real robots usually cannot withstand such exploration without suffering damage. This problem can be circumvented by combining learning with model-based control. In this letter, we employ a nominal model-predictive controller whose performance is impeded by an unknown model-plant mismatch. To compensate for the mismatch, we propose two approaches that combine reinforcement learning with the nominal controller. The first approach learns a compensatory control action that minimizes the same performance measure as the nominal controller. The second approach learns a compensatory signal from the difference between the transition predicted by the internal model and the actual transition. We compare the two approaches in simulations of a robot attached to the ground performing a setpoint-reaching task. We implement the better-performing approach on the real robot and demonstrate successful learning.
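The letter itself contains no code, but the two compensation schemes in the abstract can be summarized in a short sketch. Everything below (the class NominalMPC, the functions combined_action and mismatch_signal, the toy gains, and the rl_policy callable) is an illustrative assumption for exposition, not the authors' implementation:

```python
import numpy as np

class NominalMPC:
    """Stand-in for the nominal model-predictive controller with an
    imperfect internal model (hypothetical, for illustration only)."""

    def control(self, x, x_ref):
        # Placeholder feedback law; in the letter this is the solution
        # of the nominal optimal-control problem.
        return 0.1 * (x_ref - x)

    def predict(self, x, u):
        # One-step prediction of the internal (nominal) model.
        return x + 0.05 * u

def combined_action(mpc, rl_policy, x, x_ref):
    """Approach 1: reinforcement learning provides an additive
    compensatory action, trained on the same performance measure
    as the nominal controller."""
    u_mpc = mpc.control(x, x_ref)
    u_rl = rl_policy(x)            # learned compensation
    return u_mpc + u_rl

def mismatch_signal(mpc, x, u, x_next):
    """Approach 2: the learning signal is the difference between the
    transition predicted by the internal model and the actual one."""
    x_pred = mpc.predict(x, u)
    return x_next - x_pred         # model-plant mismatch to compensate

if __name__ == "__main__":
    mpc = NominalMPC()
    rl_policy = lambda x: -0.02 * x          # toy stand-in for a learned policy
    x, x_ref = np.array([1.0]), np.array([0.0])
    u = combined_action(mpc, rl_policy, x, x_ref)
    print(mismatch_signal(mpc, x, u, x_next=np.array([0.9])))
```

In both schemes the nominal controller keeps the system safe from the start, while the learner only has to account for the residual mismatch rather than the full dynamics.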
