From Supervised to Reinforcement Learning: an Inverse Optimization Approach

Master Thesis (2021)
Author(s)

I. Dimanidis (TU Delft - Mechanical Engineering)

Contributor(s)

P. Mohajerin Esfahani – Mentor (TU Delft - Team Bart De Schutter)

M. Mazo Espinosa – Graduation committee member (TU Delft - Team Manuel Mazo Jr)

B. Atasoy – Graduation committee member (TU Delft - Transport Engineering and Logistics)

Faculty
Mechanical Engineering
Publication Year
2021
Language
English
Graduation Date
10-12-2021
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Systems and Control
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We propose a novel method that combines elements of supervised learning and Q-learning for the control of dynamical systems subject to unknown disturbances. Using the Inverse Optimization framework and in-hindsight information, we derive a causal parametric optimization policy that approximates a non-causal MPC expert. Furthermore, we propose a new min-max MPC scheme that robustifies against disturbances within a ball around a nominal disturbance trajectory. This scheme admits an exact convex reformulation via the S-Lemma and is also approximated using Inverse Optimization. Finally, simulation studies illustrate and validate our approach.
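To give a flavor of the high-level idea, the following is a minimal, purely illustrative sketch: a causal parametric policy is fit by supervised regression to imitate an expert that has in-hindsight (non-causal) access to the disturbance. The scalar dynamics, the expert rule, and the linear policy class are assumptions for illustration only, not the thesis's actual Inverse Optimization formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed scalar linear system: x_next = a*x + b*u + w
a, b = 0.9, 1.0
N = 10_000

# Training data: states and disturbances sampled independently (assumption).
x = rng.standard_normal(N)
w = 0.1 * rng.standard_normal(N)

# Non-causal expert: with hindsight access to w, it drives the next state
# to zero exactly via u = -(a*x + w)/b.
u_expert = -(a * x + w) / b

# Causal linear policy u = theta*x, fit by least squares on the expert's
# state-action pairs (a supervised surrogate for the imitation step).
theta = float(x @ u_expert / (x @ x))

# The causal policy can only recover the predictable part of the expert,
# theta ~ -a/b; the disturbance itself is unobservable before acting.
print(f"learned gain: {theta:.3f}  (hindsight-optimal causal gain: {-a/b:.3f})")
```

The point of the sketch is that the learned causal gain converges to the best state-feedback approximation of the non-causal expert, while the unpredictable disturbance component remains an irreducible imitation gap.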

Files

Main.pdf
(pdf | 1.35 MB)
License info not available