Model-plant mismatch compensation using reinforcement learning

Journal Article (2018)
Author(s)

I. Koryakovskiy (TU Delft - Biomechatronics & Human-Machine Control)

M. Kudruss (University of Heidelberg)

Heike Vallery (TU Delft - Biomechatronics & Human-Machine Control)

Robert Babuska (TU Delft - Learning & Autonomous Control)

W. Caarls (Pontifical Catholic University of Rio de Janeiro)

Research Group
Biomechatronics & Human-Machine Control
Copyright
© 2018 I. Koryakovskiy, M. Kudruss, H. Vallery, R. Babuska, W. Caarls
DOI (related publication)
https://doi.org/10.1109/LRA.2018.2800106
Publication Year
2018
Language
English
Issue number
3
Volume number
3
Pages (from-to)
2471 - 2477
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning-based approaches are suitable for controlling systems with unknown dynamics. However, learning from scratch requires many trials with exploratory actions before a good control policy is discovered, and real robots usually cannot withstand such exploration without suffering damage. This problem can be circumvented by combining learning with model-based control. In this letter, we employ a nominal model-predictive controller whose performance is impeded by an unknown model-plant mismatch. To compensate for the mismatch, we propose two approaches that combine reinforcement learning with the nominal controller. The first approach learns a compensatory control action that minimizes the same performance measure as the nominal controller. The second approach learns a compensatory signal from the difference between the transition predicted by the internal model and the actual transition. We compare the two approaches in simulations of a robot attached to the ground performing a setpoint-reaching task. We implement the better-performing approach on the real robot and demonstrate successful learning.
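The letter itself contains no code, but the two compensation schemes in the abstract can be summarized in a short sketch. Everything below (the class NominalMPC, the functions combined_action and mismatch_signal, the toy gains, and the rl_policy callable) is an illustrative assumption for exposition, not the authors' implementation:

```python
import numpy as np

class NominalMPC:
    """Stand-in for the nominal model-predictive controller with an
    imperfect internal model (hypothetical, for illustration only)."""

    def control(self, x, x_ref):
        # Placeholder feedback law; in the letter this is the solution
        # of the nominal optimal-control problem.
        return 0.1 * (x_ref - x)

    def predict(self, x, u):
        # One-step prediction of the internal (nominal) model.
        return x + 0.05 * u

def combined_action(mpc, rl_policy, x, x_ref):
    """Approach 1: reinforcement learning provides an additive
    compensatory action, trained on the same performance measure
    as the nominal controller."""
    u_mpc = mpc.control(x, x_ref)
    u_rl = rl_policy(x)            # learned compensation
    return u_mpc + u_rl

def mismatch_signal(mpc, x, u, x_next):
    """Approach 2: the learning signal is the difference between the
    transition predicted by the internal model and the actual one."""
    x_pred = mpc.predict(x, u)
    return x_next - x_pred         # model-plant mismatch to compensate

if __name__ == "__main__":
    mpc = NominalMPC()
    rl_policy = lambda x: -0.02 * x          # toy stand-in for a learned policy
    x, x_ref = np.array([1.0]), np.array([0.0])
    u = combined_action(mpc, rl_policy, x, x_ref)
    print(mismatch_signal(mpc, x, u, x_next=np.array([0.9])))
```

In both schemes the nominal controller keeps the system safe from the start, while the learner only has to account for the residual mismatch rather than the full dynamics.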
