M.M. de Weerdt | TU Delft Repository

Tight MIP Formulations for Optimal Operation and Investment of Storage Including Reserves

Preprint (2026) - M.B. Elgersma, M.M. de Weerdt, K.I. Aardal, G.A. Morales España, Niina Helistö, Juha Kiviluoma

Fast and accurate large-scale energy system models are needed to investigate the potential of storage to complement the fluctuating energy production of renewable energy systems. However, standard Mixed-Integer Programming (MIP) models that describe optimal investment and operation of these storage units, including the optional capacity to provide up/down reserves, do not scale well. To improve scalability, the integrality constraints are often relaxed, resulting in Linear Programming (LP) relaxations that allow simultaneous charging and discharging, while this is not feasible in practice. To address this, we derive the convex hull of the solutions for the optimal operation of storage for one time period, as well as for problems including investments and reserves, guaranteeing that no tighter MIP formulation or better LP approximation exists for one time period. When incorporating this convex hull into a multi-period formulation and including it in large-scale energy system models, the improved LP relaxations can better prevent simultaneous charging and discharging, and the tighter MIP could positively affect the solving time. We demonstrate this with illustrative case studies of a unit commitment problem and a transmission expansion planning problem. ...

When effort may fail

Equilibria of shared effort with a threshold

Journal article (2026) - Gleb Polevoy, Stojan Trajanovski, Mathijs de Weerdt

People, robots, and companies mostly divide time and effort among projects, and shared effort games model people investing resources in public endeavours and sharing the generated values. In linear θ sharing (effort) games, a project’s value is linear in the total contribution, thus modelling predictable, uniform, and scalable activities. The threshold θ for effort defines which contributors win and receive their share, equal share modelling standard salaries, equity-minded projects, etc. Thresholds between 0 and 1 model games such as paper co-authorship and shared assignments, where a minimum positive contribution is required for sharing in the value. We constructively characterise the conditions for the existence of a pure equilibrium for θ∈{0,1}, and for two-player games with a general threshold, and find the prices of anarchy and stability. We also provide existence and efficiency results for more than two players, and use generalised fictitious play simulations to show when a pure equilibrium exists and what its efficiency is. We propose a method for studying solution concepts by refining a solution concept and finding a large natural subclass of games where the refinement coincides with the original solution concept (Nash, in this case). This means that the original concept narrows down to a more demanding concept on certain games, providing new insights for comparing both concepts. We also prove mixed equilibria always exist and bound their efficiency. ...

Constraint programming for scheduling a fermentation plant

Journal article (2026) - Kim van den Houten, Eva Christopoulou, Esteban Freydell, David M.J. Tax, Mathijs de Weerdt

This study examines the scheduling challenges faced by a real-world biomanufacturing site. This fermentation factory is a multi-product, multi-stage facility where batches are scheduled to meet customer demand. Importantly, in the production process of these batches, they are split and mixed into subbatches. Between stages, the intermediate product’s expiration should be avoided by maintaining a continuous flow without waiting in the factory, resulting in zero-wait policies. Sequence-dependent cleaning operations are also required between processing steps to avoid cross-contamination. Recent advances in constraint programming (CP) have demonstrated strong performance on standard scheduling problems. In this work, we investigate how effectively the biomanufacturing scheduling problems can be modeled with CP. Then, we investigate the performance of state-of-the-art CP solvers for solving real, large-scale biomanufacturing scheduling instances. Our empirical evaluation shows that CP Optimizer is more effective than Google OR Tools CP SAT. We show that a warm-start strategy based on domain knowledge improves the performance of the CP approach. Our empirical results show that relaxing the zero-wait constraints results in lower optimality gaps. Finally, we make our set of problem instances and CP model publicly available, thereby extending the scheduling literature with large-scale, realistic benchmarking instances obtained by our industrial collaboration. ...

Multi-Agent Pathfinding for Railway Routing

Abstract (2025) - Issa Hanou, Mathijs de Weerdt

Research in railway operations has mostly focused on operations research methods. However, these real-world problems have a state-based nature, which makes them very suitable for AI models, such as the Multi-Agent Pathfinding problem, where agents move in a grid and need to be routed from their start to their goal location without colliding with each other. The core aspect of problems like train shunting and train dispatching is routing, which is often not the main focus of current mathematical formulations. Therefore, we apply the state-of-the-art algorithms to the railway problems of shunting and dispatching and study their usability for routing trains. The Multi-Agent Pathfinding problem is often solved with one of two algorithms: conflict-based search (a two-stage algorithm detecting conflicts between individual paths and using A* search to find new conflict-free paths), and branch-cut-and-price (a linear program adding cuts (row generation) based on problem-specific constraints, and finding new paths to be selected that satisfy all constraints using a pricer). We modify these algorithms to include more railway details. First, we allow for the matching of train units (i.e., ensure the necessary train units of a certain type are available for departure) by specifying goals for agent (type) groups instead of single agent goals. Moreover, we add goal sequences for servicing stations and agents of different sizes, and we study specific aspects of the railway infrastructure to exploit in the algorithm. Finally, we show the use of Multi-Agent Pathfinding solvers in different railway settings and analyze the conditions for success. ...

Performance and interaction assessment of neural network architectures and bivariate smart predict-then-optimize

Journal article (2025) - Junhan Wen, Thomas Abeel, Mathijs de Weerdt

Smart “predict, then optimize” (SPO) (Elmachtoub in Manag Sci 68(1): 9–26, 2022) is an end-to-end learning strategy for models that predict parameters in optimization problems. Unlike minimizing mean squared error (MSE) which cares about prediction accuracies, SPO aims to ensure that predictions lead to the best possible decisions. The associated loss function, termed SPO loss, measures the decision’s regret from optimal outcomes with parameter realizations. Existing literature has demonstrated the viability of SPO, however, these studies often focus on classical optimization problems and employ a limited set of models for benchmarking. In this study, we tackled a decision-making task inspired by real-world challenges across a wide range of neural network models. Unlike classical problems, our task requires a unique approach: collaboratively training two models to predict different variables. On top of that, one of the decision variables also affects the feasibility of the decisions, further increasing the complexity. While our implementation validates the benefits of SPO, we were surprised to find that models trained exclusively on SPO loss do not consistently attain the minimum regret. Our further investigation into hyperparameters illustrates that the well-tuned models learned very similar patterns from the feature set, irrespective of whether MSE or SPO loss was used. In other words, the change from MSE to SPO loss in training primarily affected the layer biases. Therefore, to improve the learning efficacy with SPO loss, we propose prioritizing learning feature patterns as the fundamental step. Possible strategies include using specialized neural network layers to capture deeper patterns more effectively or simply warming up by training with MSE. Specifically, a warming-up process is particularly advantageous for model(s) where the outputs are closely tied to constraints, as their prediction accuracy significantly impacts the decision feasibility. The insights are investigated empirically through two real-world trading scenarios. By leveraging datasets with diverse properties, we demonstrate the novelty and generalizability of our investigation. ...

Smart “predict, then optimize” (SPO) (Elmachtoub in Manag Sci 68(1): 9–26, 2022) is an end-to-end learning strategy for models that predict parameters in optimization problems. Unlike minimizing mean squared error (MSE) which cares about prediction accuracies, SPO aims to ensure that predictions lead to the best possible decisions. The associated loss function, termed SPO loss, measures the decision’s regret from optimal outcomes with parameter realizations. Existing literature has demonstrated the viability of SPO, however, these studies often focus on classical optimization problems and employ a limited set of models for benchmarking. In this study, we tackled a decision-making task inspired by real-world challenges across a wide range of neural network models. Unlike classical problems, our task requires a unique approach: collaboratively training two models to predict different variables. On top of that, one of the decision variables also affects the feasibility of the decisions, further increasing the complexity. While our implementation validates the benefits of SPO, we were surprised to find that models trained exclusively on SPO loss do not consistently attain the minimum regret. Our further investigation into hyperparameters illustrates that the well-tuned models learned very similar patterns from the feature set, irrespective of whether MSE or SPO loss was used. In other words, the change from MSE to SPO loss in training primarily affected the layer biases. Therefore, to improve the learning efficacy with SPO loss, we propose prioritizing learning feature patterns as the fundamental step. Possible strategies include using specialized neural network layers to capture deeper patterns more effectively or simply warming up by training with MSE. Specifically, a warming-up process is particularly advantageous for model(s) where the outputs are closely tied to constraints, as their prediction accuracy significantly impacts the decision feasibility. The insights are investigated empirically through two real-world trading scenarios. By leveraging datasets with diverse properties, we demonstrate the novelty and generalizability of our investigation.

The maintenance scheduling and location choice problem for railway rolling stock

Journal article (2025) - Jordi Zomer, Nikola Bešinović, Mathijs M. de Weerdt, Rob M.P. Goverde

The increasing train traffic over railway networks stretches the demand for capacity of railway yards and rolling stock maintenance locations, which increasingly limits performance and further growth. Therefore, the scheduling of rolling stock maintenance and the choice regarding optimal locations to perform maintenance is increasingly complicated. This research introduces a Maintenance Scheduling and Location Choice Problem (MSLCP). It simultaneously determines maintenance locations and maintenance schedules of rolling stock, while considering the available capacity of maintenance locations. Solving the MSLCP using one large Mixed Integer Programming appears not to perform well enough. Therefore, to solve the MSLCP, an optimization framework based on Logic-Based Benders’ Decomposition (LBBD) is proposed by combining two models, the Maintenance Location Choice Problem (MLCP) and the Activity Planning Problem (APP), to assess the capacity of an MLCP solution. Within the LBBD, four variants of cut generation procedures are introduced to improve the computational performance: a naive procedure, two heuristic procedures and the so-called min-cut procedure that aims to exploit the specific characteristics of the problem at hand. The framework is demonstrated on realistic scenarios from the Dutch railways. It is shown that the best choice for the cut generation procedure depends on the objective: when aiming to find a good but not necessarily optimal solution, the min-cut procedure performs best, whereas when aiming for the optimal solution, one of the heuristic procedures is the preferred option. The techniques used in the current research are new to the current field and offer interesting next research opportunities. ...

Increasing the Capacity of Shunting Yards Within the Current Infrastructure

A Computational Perspective

Book chapter (2025) - Issa K. Hanou, Sebastijan Dumančić, Mathijs de Weerdt, Paul van der Voort, Roel van den Broek, Marjan van den Akker

With a dense infrastructure and limited space, the opportunities for increasing the capacity of the railway network in the Netherlands are limited. One of the bottlenecks is optimally using the available space around stations and in shunting yards. Many details must be considered, increasing the complexity of the problem. Human planners can benefit from computational support to ensure efficient use of the infrastructure. We introduce a framework for positioning previous research in terms of abstractions and highlight a promising future direction: the development of a new approach that combines different methods and uses the relations between the abstractions to create more efficient solutions. ...

EnergySHR

A platform for energy dataset sharing and communications

Conference paper (2025) - Zaman Ziabakhshganji, Mathijs de Weerdt, Sreeparna Deb, Caroline Duterloo, Yashar Ghiassi-Farrokhfal, Doron Gollnast, Jhon Jairo Quinones-Cortes, Alicia Julia Wilson Takaoka, Simon Tindemans, More authors...

Because the energy transition is a critical and urgent issue that is increasingly reliant on data, the Center for Energy System Intelligence (CESI), a Convergence collaboration between TU Delft and Erasmus University Rotterdam, has developed a platform where researchers on the energy transition can share, publish, and/or find energy-related datasets and algorithms: EnergySHR. This platform aims to accelerate energy transition research into intelligent, data-driven algorithms. In this demonstration, we present the EnergySHR platform as both a platform for storing, accessing, managing, and archiving datasets as well as a tool to conduct empirical research about platformization and data-driven decision-making about the energy transition. ...

Tight and Compact MILP Models for Storage Operation and Unit Commitment

Conference paper (2025) - Maaike B. Elgersma, Germán A. Morales-España, Karen I. Aardal, Mathijs M. de Weerdt

To achieve climate goals by 2050, accurate energy system optimization (MIP) models are needed to help decision-makers make investment plans. To increase accuracy, a high resolution in the temporal and spatial dimensions is needed, as well as many details on the operational capabilities of energy generators. However, this results in large-scale models that do not scale well. Thus, researchers often seek the right trade-off between computational tractability and accuracy. Here, we present a tighter formulation for optimal storage operation and investment problems, including reserves, along with the methodology we used to obtain it, based on the work of [2]. Additionally, we present some preliminary work aiming to provide tight and compact unit commitment models with different levels of detail. These models can be included in large-scale energy system optimization models to increase model accuracy while keeping the models computationally tractable. ...

Proactive and Reactive Constraint Programming for Stochastic Project Scheduling with Maximal Time-Lags

Journal article (2025) - Kim van den Houten, Léon Planken, Esteban Freydell, David M.J. Tax, Mathijs de Weerdt

This study investigates scheduling strategies for the stochastic resource-constrained project scheduling problem with maximal time lags (SRCPSP/max). Recent advances in Constraint Programming (CP) and Temporal Networks have re-invoked interest in evaluating the advantages and drawbacks of various proactive and reactive scheduling methods. First, we present a new, CP-based fully proactive method. Second, we show how a reactive approach can be constructed using an online rescheduling procedure. A third contribution is based on partial order schedules and uses Simple Temporal Networks with Uncertainty (STNUs). Our statistical analysis shows that the STNU-based algorithm performs best in terms of solution quality, while also showing good relative offline and online computation time. ...

Exposing a locational energy market to uncertainty

Journal article (2025) - Longjian Piao, Laurens de Vries, Mathijs de Weerdt, Neil Yorke-Smith

Future energy markets for low voltage AC and DC distribution systems will facilitate prosumer participation in the market. To comply with market regulations and grid constraints, a tailored market design reflecting (DC) operational requirements is needed. Our previous work identified a locational energy market design. However, its real-life implementation faces challenges due to uncertainties in system operation, prosumer preferences, and bidding strategies. This article tests the market design under uncertain scenarios. To this end, we develop an agent-based model that simulates typical electric vehicle user preferences and bidding strategies, influenced by varying degrees of range anxiety. The market design is tested in challenging scenarios with a high share of solar panels and electric vehicles, modelled using the high-resolution Pecan Street database. Simulations indicate that the proposed market design maintains both economic efficiency and system reliability under real-life uncertainties. This in turn indicates the practical feasibility of locational energy markets in helping to integrate renewable generation sources and bidirectional power flows. ...

The Research Project in Computer Science Bachelor Education

Undergraduate Research Experience at Scale

Conference paper (2025) - Gosia Migut, Aleksander Buszydlik, Mathijs M. De Weerdt

Exposure to research is an important component of undergraduate university education, cultivating critical thinking, problem-solving, and preparation for advanced study. However, providing individual research experiences for large cohorts of undergraduate students poses significant logistical challenges. This paper demonstrates how an undergraduate research experience can be achieved at scale for a large computer science program. Our approach integrates individual research projects into the undergraduate computer science curriculum for up to almost 400 students within a single 10-week course. We describe three key features of our approach: (1) a matching algorithm that assigns students to research projects based on their preferences, (2) peer-group collaboration, and (3) a distributed supervision and assessment model to guide students through key research activities that include reformulating research questions, designing experiments/user studies, and presenting research. Results and feedback indicate that both students and supervisors are satisfied, demonstrating the feasibility and effectiveness of this scalable approach for integrating research experiences into large undergraduate computer science programs. ...

The Growing Strawberries Dataset

Tracking Multiple Objects with Biological Development over an Extended Period

Conference paper (2024) - Junhan Wen, Camiel R. Verschoor, Chengming Feng, Irina Mona Epure, Thomas Abeel, Mathijs De Weerdt

Multiple Object Tracking (MOT) is a rapidly developing research field that targets precise and reliable tracking of objects. Unfortunately, most available MOT datasets typically contain short video clips only, disregarding the indispensable requirement for adequately capturing substantial long-term variations in real-world scenarios. Long-term MOT poses unique challenges due to changes in both the objects and the environment, which remain relatively unexplored. To fill the gap, we propose a time-lapse image dataset inspired by the growth monitoring of strawberries, dubbed The Growing Strawberries Dataset (GSD). The data was captured hourly by six cameras, covering a span of 16 months in 2021 and 2022. During this time, it encompassed a total of 24 plants in two separate greenhouses. The changes in appearance, weight, and position during the ripening process, along with variations in the illumination during data collection, distinguish the task from previous MOT research. These practical issues resulted in a drastic performance downgrade in the track identification and association tasks of state-of-the-art MOT algorithms. We believe The Growing Strawberries will provide a platform for evaluating such long-term MOT tasks and inspire future research. The dataset is available at https://doi.org/10.4121/e3b31ece-cc88-4638-be10-8ccdd4c5f2f7.v1. ...

Replanning in Advance for Instant Delay Recovery in Multi-Agent Applications

Rerouting Trains in a Railway Hub

Conference paper (2024) - Issa K. Hanou, Devin Wild Thomas, Wheeler Ruml, Mathijs de Weerdt

Train routing is sensitive to delays that occur in the network. When a train is delayed, it is imperative that a new plan be found quickly, or else other trains may need to be stopped to ensure safety, potentially causing cascading delays. In this paper, we consider this class of multi-agent planning problems, which we call Multi-Agent Execution Delay Replanning. We show that these can be solved by reducing the problem to an any-start-time safe interval path planning problem. When an agent has an any-start-time plan, it can react to a delay by simply looking up the precomputed plan for the delayed start time. We identify crucial real-world problem characteristics like the agent's speed, size, and safety envelope, and extend the any-start-time planning to account for them. Experimental results on real-world train networks show that any-start-time plans are compact and can be computed in reasonable time while enabling agents to instantly recover a safe plan. ...

Learning from Scenarios for Repairable Stochastic Scheduling

Conference paper (2024) - Kim van den Houten, David M.J. Tax, Esteban Freydell, Mathijs de Weerdt

When optimizing problems with uncertain parameter values in a linear objective, decision-focused learning enables end-to-end learning of these values. We are interested in a stochastic scheduling problem, in which processing times are uncertain, which brings uncertain values in the constraints, and thus repair of an initial schedule may be needed. Historical realizations of the stochastic processing times are available. We show how existing decision-focused learning techniques based on stochastic smoothing can be adapted to this scheduling problem. We include an extensive experimental evaluation to investigate in which situations decision-focused learning outperforms the state of the art, i.e., scenario-based stochastic optimization. ...

Real-Time Data-Driven Maintenance Logistics

A Public-Private Collaboration

Conference paper (2024) - Willem van Jaarsveld, Laurens Bliek, Mathijs de Weerdt, Stella Kapodistria, Verus Pronk, Peter Verleijsdonk, Simon Voorberg, Sicco Verwer, Yingqian Zhang, More authors...

The project “Real-time data-driven maintenance logistics” was initiated with the purpose of bringing innovations in data-driven decision making to maintenance logistics, by bringing problem owners in the form of three innovative companies together with researchers at two leading knowledge institutions. This paper reviews innovations in three related areas: How the innovations were inspired by practice, how they materialized, and how the results impact practice. ...

Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks

Preprint (2024) - Joery A de Vries, Jinke He, Mathijs de Weerdt, Matthijs TJ Spaan

Paths, Proofs, and Perfection

Developing a Human-Interpretable Proof System for Constrained Shortest Paths

Journal article (2024) - Konstantin Sidorov, Gonçalo Homem de Almeida Correia, Mathijs de Weerdt, Emir Demirović

People want to rely on optimization algorithms for complex decisions but verifying the optimality of the solutions can then become a valid concern, particularly for critical decisions taken by non-experts in optimization. One example is the shortest-path problem on a network, occurring in many contexts from transportation to logistics to telecommunications. While the standard shortest-path problem is both solvable in polynomial time and certifiable by duality, introducing side constraints makes solving and certifying the solutions much harder. We propose a proof system for constrained shortest-path problems, which gives a set of logical rules to derive new facts about feasible solutions. The key trait of the proposed proof system is that it specifically includes high-level graph concepts within its reasoning steps (such as connectivity or path structure), in contrast to using linear combinations of model constraints. Using our proof system, we can provide a step-by-step, human-auditable explanation showing that the path given by an external solver cannot be improved. Additionally, to maximize the advantages of this setup, we propose a proof search procedure that specifically aims to find small proofs of this form using a procedure similar to A* search. We evaluate our proof system on constrained shortest path instances generated from real-world road networks and experimentally show that we may indeed derive more interpretable proofs compared to an integer programming approach, in some cases leading to much smaller proofs. ...

To the Max

Reinventing Reward in Reinforcement Learning

Journal article (2024) - Grigorii Veviurko, Wendelin Böhmer, Mathijs de Weerdt

In reinforcement learning (RL), different reward functions can define the same optimal policy but result in drastically different learning performance. For some, the agent gets stuck with a suboptimal behavior, and for others, it solves the task efficiently. Choosing a good reward function is hence an extremely important yet challenging problem. In this paper, we explore an alternative approach for using rewards for learning. We introduce max-reward RL, where an agent optimizes the maximum rather than the cumulative reward. Unlike earlier works, our approach works for deterministic and stochastic environments and can be easily combined with state-of-the-art RL algorithms. In the experiments, we study the performance of max-reward RL algorithms in two goal-reaching environments from Gymnasium-Robotics and demonstrate its benefits over standard RL. The code is available at https://github.com/veviurko/To-the-Max. ...

Benchmarking surrogate-based optimisation algorithms on expensive black-box functions

Journal article (2023) - Laurens Bliek, Arthur Guijt, Rickard Karlsson, Sicco Verwer, Mathijs de Weerdt

Surrogate algorithms such as Bayesian optimisation are especially designed for black-box optimisation problems with expensive objectives, such as hyperparameter tuning or simulation-based optimisation. In the literature, these algorithms are usually evaluated with synthetic benchmarks which are well established but have no expensive objective, and only on one or two real-life applications which vary wildly between papers. There is a clear lack of standardisation when it comes to benchmarking surrogate algorithms on real-life, expensive, black-box objective functions. This makes it very difficult to draw conclusions on the effect of algorithmic contributions and to give substantial advice on which method to use when. A new benchmark library, EXPObench, provides first steps towards such a standardisation. The library is used to provide an extensive comparison of six different surrogate algorithms on four expensive optimisation problems from different real-life applications. This has led to new insights regarding the relative importance of exploration, the evaluation time of the objective, and the used model. We also provide rules of thumb for which surrogate algorithm to use in which situation. A further contribution is that we make the algorithms and benchmark problem instances publicly available, contributing to more uniform analysis of surrogate algorithms. Most importantly, we include the results of the six algorithms on all evaluated problem instances. This unique new dataset lowers the bar for researching new methods as the number of expensive evaluations required for comparison and for the creation of new surrogate models is significantly reduced. ...