Safe data-driven model predictive control of systems with complex dynamics

Journal article (2023)

Authors

Ioanna Mitsioni KTH Royal Institute of Technology

Pouria Tajvar KTH Royal Institute of Technology

Danica Kragic KTH Royal Institute of Technology

Jana Tumova KTH Royal Institute of Technology

Christian Pek KTH Royal Institute of Technology

Affiliation

External organisation

To reference this document use:

http://resolver.tudelft.nl/uuid:ff10c27b-854d-40ff-b015-2d6cd857f27f

Published Date

2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this article, we address the task and safety performance of data-driven model predictive controllers (DD-MPC) for systems with complex dynamics, i.e., temporally or spatially varying dynamics that may also be discontinuous. The three challenges we focus on are the accuracy of learned models, the receding horizon-induced myopic predictions of DD-MPC, and the active encouragement of safety. To learn accurate models for DD-MPC, we cautiously, yet effectively, explore the dynamical system with rapidly exploring random trees (RRT) to collect a uniform distribution of samples in the state-input space and overcome the common distribution shift in model learning. The learned model is further used to construct an RRT that estimates how close the model's predictions are to the desired target. This information is used in the cost function of the DD-MPC to minimize the short-sighted effect of its receding horizon nature. To promote safety, we approximate sets of safe states using demonstrations of exclusively safe trajectories, i.e., without unsafe examples, and encourage the controller to generate trajectories close to these sets. As a running example, we use a broken version of an inverted pendulum in which the friction abruptly changes in certain regions. Furthermore, we showcase the adaptation of our method to a real-world robotic application with complex dynamics: robotic food-cutting. Our results show that our proposed control framework effectively avoids unsafe states with higher success rates than baseline controllers that employ models from controlled demonstrations and even from random actions.
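To make the running example in the abstract more concrete, the following is a minimal sketch of a pendulum whose friction changes abruptly inside one angular region, which is the kind of spatially varying, discontinuous dynamics the abstract describes. It is not the authors' model: the function names, the friction values, the location of the high-friction region, the physical constants, and the explicit Euler integration are all assumptions chosen for illustration.

```python
import math


def friction_coefficient(theta, low=0.1, high=2.0, region=(0.5, 1.5)):
    """Hypothetical piecewise-constant friction.

    Returns `high` when the wrapped angle lies inside `region` (radians)
    and `low` elsewhere. All values are illustrative assumptions.
    """
    # Wrap the angle to [-pi, pi) so the region test is periodic.
    wrapped = (theta + math.pi) % (2 * math.pi) - math.pi
    return high if region[0] <= wrapped <= region[1] else low


def pendulum_step(theta, omega, torque, dt=0.01, g=9.81, length=1.0, mass=1.0):
    """One explicit Euler step of a damped pendulum.

    theta = 0 is the hanging-down position; `torque` is the control input.
    The friction term switches discontinuously with the angle.
    """
    b = friction_coefficient(theta)
    alpha = (torque - b * omega - mass * g * length * math.sin(theta)) / (
        mass * length ** 2
    )
    theta_next = theta + dt * omega
    omega_next = omega + dt * alpha
    return theta_next, omega_next


def rollout(theta, omega, torques, dt=0.01):
    """Simulate a sequence of torques and return the visited states."""
    states = [(theta, omega)]
    for u in torques:
        theta, omega = pendulum_step(theta, omega, u, dt=dt)
        states.append((theta, omega))
    return states
```

A model learned only from samples outside the high-friction region would mispredict the motion inside it, which illustrates why the abstract emphasizes uniform coverage of the state-input space during exploration.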