Model-based reinforcement learning for predictive control and optimisation

Doctoral Thesis (2026)
Author(s)

F. Airaldi (TU Delft - Team Azita Dabiri)

Contributor(s)

B. De Schutter – Promotor (TU Delft - Delft Center for Systems and Control)

A. Dabiri – Copromotor (TU Delft - Team Azita Dabiri)

Research Group
Team Azita Dabiri
DOI related publication
https://doi.org/10.4233/uuid:f0e6d5cb-949b-46ca-ba64-e5a0e924d9bb
Final published version
Publication Year
2026
Language
English
Defense Date
11-05-2026
Awarding Institution
Delft University of Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In the current age of emerging autonomous and artificial-intelligence-driven machines, sequential decision making constitutes one of the theoretical foundations at the core of intelligent agency. As these systems are increasingly deployed in real-world engineering applications (e.g., autonomous vehicles and drones, as well as smart energy grids and greenhouses), there is a growing need for the control architectures governing these agents to meet interpretability and safety criteria in addition to traditional performance requirements, while also remaining adaptable and scalable. Classical model-based methodologies, such as Model Predictive Control (MPC), generally provide rigorous frameworks that integrate a priori knowledge (e.g., via explicit, though often approximate, prediction models) and can handle constraints to enforce safety, yet their performance is tightly coupled to the accuracy of the underlying model and to expert manual tuning of its parameters. Conversely, purely model-free approaches, such as deep Reinforcement Learning (RL), offer remarkable data-driven adaptation, but often lack the interpretability and reliable constraint handling required to provide formal guarantees.
This work expands on current state-of-the-art results that combine these two distinct approaches into a single framework. While not always straightforward, it is well known that endowing decision-making processes with model-based knowledge can not only enhance their performance but also benefit their interpretability and analysis: model-based RL, also known as Approximate Dynamic Programming (ADP), is perhaps the most renowned machine learning paradigm for crafting such intelligent predictive agents. This dissertation looks at RL from a different perspective: instead of serving as an alternative to model-based control, RL is used as a performance-enhancing mechanism operating within rigorously defined safety requirements. Concurrently, this thesis establishes MPC as a unifying and scalable building block for learning-based control and optimisation in constrained, uncertain, and distributed decision-making systems…
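The idea of using RL as a tuning mechanism inside an MPC controller, rather than as a replacement for it, can be illustrated with a minimal toy sketch. The sketch below is not taken from the thesis: the scalar system, the one-step MPC with a single learnable cost weight `theta`, and the finite-difference update are all illustrative assumptions. The key point it demonstrates is that the input constraint is enforced by the MPC layer at every step, while the learner only adjusts the controller's cost parameter to compensate for model mismatch.

```python
def mpc_policy(x, theta, a_hat=0.8, b=1.0):
    # One-step MPC with approximate model x1 = a_hat*x + b*u:
    # minimise theta*x1**2 + u**2 subject to |u| <= 1.
    u = -theta * a_hat * b * x / (1.0 + theta * b * b)  # unconstrained minimiser
    return max(-1.0, min(1.0, u))  # input constraint enforced inside the MPC


def rollout_cost(theta, x0=3.0, a_true=0.95, b=1.0, steps=30):
    # Closed-loop cost on the TRUE system (a_true != a_hat: model mismatch).
    x, cost = x0, 0.0
    for _ in range(steps):
        u = mpc_policy(x, theta)
        cost += x * x + 0.1 * u * u
        x = a_true * x + b * u
    return cost


def tune_theta(theta=0.5, iters=50, lr=0.05, eps=1e-3):
    # RL as a performance-enhancing tuner: a finite-difference gradient
    # estimate of the closed-loop cost w.r.t. theta drives the update,
    # while safety (the input bound) stays inside the MPC at all times.
    for _ in range(iters):
        grad = (rollout_cost(theta + eps) - rollout_cost(theta - eps)) / (2 * eps)
        theta = max(1e-3, theta - lr * grad)
    return theta
```

In this toy setting the learner never outputs an action directly; every applied input passes through the constrained MPC optimisation, so the bound `|u| <= 1` holds regardless of how badly `theta` is tuned. The thesis develops this separation of concerns rigorously, with MPC-based function approximation rather than a single scalar parameter.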

Files

Dissertation-with-cover.pdf
(pdf | 2.57 MB)
License info not available