S.H. Mallick | TU Delft Repository

Distributed and learning-based model predictive control

Doctoral thesis (2026) - S.H. Mallick, B. De Schutter, A. Dabiri

This dissertation advances Model Predictive Control (MPC) by addressing two major challenges: computational complexity and model uncertainty. The research focuses on distributed control and learning-based approaches to facilitate MPC for hybrid systems, large-scale networks, and systems with limited model knowledge.

To reduce computational burden, new distributed MPC methods for piecewise affine systems are developed, providing efficient convex optimisation-based solutions with guarantees on consistency and feasibility. Learning-based policies are also integrated with MPC, shifting computationally intensive tasks offline and enabling efficient control of hybrid systems and autonomous vehicles.

To address uncertainty, reinforcement learning (RL) is combined with MPC to learn uncertain controller components from data. Novel distributed MPC-RL frameworks are proposed for networked systems. Furthermore, centralised MPC-RL controllers are proposed for applications such as greenhouse climate control and energy systems. The results demonstrate that distributed and learning-based MPC can significantly improve scalability, efficiency, and performance in complex real-world control problems. ...

Learning-Based Model Predictive Control for Piecewise Affine Systems with Feasibility Guarantees

Conference paper (2025) - S.H. Mallick, A. Dabiri, B. De Schutter

Online model predictive control (MPC) for piecewise affine (PWA) systems requires the online solution of an optimization problem that implicitly optimizes over the switching sequence of PWA regions, for which the computational burden can be prohibitive. Alternatively, the computation can be moved offline using explicit MPC; however, the online memory requirements and the offline computation can then become excessive. In this work we propose a solution in between online and explicit MPC, addressing the above issues by dividing the computation between online and offline. To solve the underlying MPC problem, a policy, learned offline, specifies the sequence of PWA regions that the dynamics must follow, thus reducing the complexity of the remaining optimization problem, which optimizes over only the continuous states and control inputs. We provide a condition, verifiable during learning, that guarantees feasibility of the learned policy’s output, such that an optimal continuous control input can always be found online. Furthermore, a method for iteratively generating training data offline allows the feasible policy to be learned efficiently, reducing the offline computational burden. A numerical experiment demonstrates the effectiveness of the method compared to both online and explicit MPC. ...
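As a rough illustration of the idea in this abstract, the sketch below shows what fixing the sequence of PWA regions means: the dynamics become time-varying affine, and a candidate sequence is only usable if the resulting trajectory actually stays in the prescribed regions. The scalar system, its coefficients A and B, and the helper names are hypothetical and are not taken from the paper; the learned policy and the continuous optimization are not shown.

```python
# Hypothetical scalar PWA system (illustration only, not from the paper):
#   region 0 (x < 0):   x+ = 0.5 * x + u
#   region 1 (x >= 0):  x+ = 1.2 * x + u
A = (0.5, 1.2)
B = (1.0, 1.0)


def region_of(x):
    """Index of the PWA region that contains the state x."""
    return 0 if x < 0.0 else 1


def rollout(x0, inputs, seq):
    """Simulate the dynamics with the active region at each step fixed by seq.

    With seq fixed, each step is affine in (x, u), so no integer decisions remain.
    """
    states = [x0]
    for u, i in zip(inputs, seq):
        states.append(A[i] * states[-1] + B[i] * u)
    return states


def follows_sequence(states, seq):
    """True if every state x_k lies in the region that seq prescribes for step k."""
    return all(region_of(x) == i for x, i in zip(states, seq))


x0 = 1.0
inputs = [-2.0, 0.5]

# A sequence that matches the trajectory: start in region 1, then move to region 0.
good = rollout(x0, inputs, [1, 0])
print(good, follows_sequence(good, [1, 0]))

# A sequence that the same inputs cannot follow: x_1 is negative, not in region 1.
bad = rollout(x0, inputs, [1, 1])
print(bad, follows_sequence(bad, [1, 1]))
```

In an MPC setting, the consistency check would appear as constraints of the continuous problem rather than as an after-the-fact test; the check here only makes the notion of a feasible sequence concrete.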

Reinforcement learning-based model predictive control for greenhouse climate control

Journal article (2025) - Samuel Mallick, Filippo Airaldi, Azita Dabiri, Congcong Sun, Bart De Schutter

Greenhouse climate control is concerned with maximizing performance in terms of crop yield and resource efficiency. One promising approach is model predictive control (MPC), which leverages a model of the system to optimize the control inputs, while enforcing physical constraints. However, prediction models for greenhouse systems are inherently inaccurate due to the complexity of the real system and the uncertainty in predicted weather profiles. For model-based control approaches such as MPC, this can degrade performance and lead to constraint violations. Existing approaches address uncertainty in the prediction model with robust or stochastic MPC methodology; however, these necessarily reduce crop yield due to conservatism and often bear higher computational loads. In contrast, learning-based control approaches, such as reinforcement learning (RL), can handle uncertainty naturally by leveraging data to improve performance. This work proposes an MPC-based RL control framework to optimize the climate control performance in the presence of prediction uncertainty. The approach employs a parametrized MPC scheme that learns directly from data, in an online fashion, the parametrization of the constraints, prediction model, and optimization cost that minimizes constraint violations and maximizes climate control performance. Simulations show that the approach can learn an MPC controller that significantly outperforms the current state-of-the-art in terms of constraint violations and efficient crop growth. ...

Distributed Model Predictive Control for Piecewise Affine Systems Based on Switching ADMM

Journal article (2025) - Samuel Mallick, Azita Dabiri, Bart De Schutter

This paper presents a novel approach for distributed model predictive control (MPC) for piecewise affine (PWA) systems. Existing approaches rely on solving mixed-integer optimization problems, requiring significant computational power or time. We propose a distributed MPC scheme that requires solving only convex optimization problems. The key contribution is a novel method, based on the alternating direction method of multipliers, for solving the non-convex optimal control problem that arises due to the PWA dynamics. We present a distributed MPC scheme, leveraging this method, that explicitly accounts for the coupling between subsystems by reaching agreement on the values of coupled states. Stability and recursive feasibility are shown under additional assumptions on the underlying system. Two numerical examples are provided, in which the proposed controller is shown to significantly reduce CPU time and improve closed-loop performance compared with existing state-of-the-art approaches. ...
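To make "reaching agreement on the values of coupled states" concrete, here is a minimal sketch of standard consensus ADMM on a toy problem. It is not the switching ADMM procedure of the paper: two hypothetical agents each hold a local quadratic cost 0.5 * (x - a_i)^2 and must agree on one shared scalar. The targets, the penalty rho, and the iteration count are assumptions chosen for illustration.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Standard consensus ADMM for: minimise sum_i 0.5 * (x_i - a_i)^2  s.t.  x_i = z.

    Each agent keeps a local copy x_i and a dual variable y_i; z is the agreed value.
    """
    n = len(targets)
    x = [0.0] * n
    y = [0.0] * n
    z = 0.0
    for _ in range(iterations):
        # Local step: each agent minimises its own augmented Lagrangian (closed form here).
        x = [(a - yi + rho * z) / (1.0 + rho) for a, yi in zip(targets, y)]
        # Agreement step: average the local copies, corrected by the scaled duals.
        z = sum(xi + yi / rho for xi, yi in zip(x, y)) / n
        # Dual step: penalise any remaining disagreement between x_i and z.
        y = [yi + rho * (xi - z) for xi, yi in zip(x, y)]
    return x, z


local_copies, agreed = consensus_admm([1.0, 3.0])
print(local_copies, agreed)
```

For this toy problem the agreed value converges to the average of the targets, and every local step is a convex problem, which is the property the abstract relies on.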

Learning-Based MPC for Fuel Efficient Control of Autonomous Vehicles With Discrete Gear Selection

Journal article (2025) - Samuel Mallick, Gianpietro Battocletti, Qizhang Dong, Azita Dabiri, Bart De Schutter

Co-optimization of vehicle speed and gear position via model predictive control (MPC) has been shown to offer benefits for fuel-efficient autonomous driving. However, optimizing both the vehicle’s continuous dynamics and discrete gear positions may be too computationally intensive for a real-time implementation. This work proposes a learning-based MPC scheme to address this issue. A policy is trained to select and fix the gear positions across the prediction horizon of the MPC controller, leaving a significantly simpler continuous optimization problem to be solved online. In simulation, the proposed approach is shown to have a significantly lower computational burden and comparable performance relative to pure MPC-based co-optimization. ...

Multi-agent reinforcement learning via distributed MPC as a function approximator

Journal article (2024) - Samuel Mallick, Filippo Airaldi, Azita Dabiri, Bart De Schutter

This paper presents a novel approach to multi-agent reinforcement learning (RL) for linear systems with convex polytopic constraints. Existing work on RL has demonstrated the use of model predictive control (MPC) as a function approximator for the policy and value functions. The current paper is the first work to extend this idea to the multi-agent setting. We propose the use of a distributed MPC scheme as a function approximator, with a structure allowing for distributed learning and deployment. We then show that Q-learning updates can be performed distributively without introducing nonstationarity, by reconstructing a centralized learning update. The effectiveness of the approach is demonstrated on a numerical example. ...