A. Villar Montero


Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.

Master thesis (1)

1 record found

Constraint- and Heuristic-Aware Deep Reinforcement Learning for Infrastructure Intervention Planning

Master thesis (2026) - A. Villar Montero, O. Kammouh, C. Andriotis, Christos Lathourakis, Andrés Martínez Colán

As urban infrastructure continues to age, municipalities face increasingly complex challenges in maintaining critical assets under financial, safety, and societal constraints. In Amsterdam, more than 200 km of historic quay walls, many of them several centuries old, require carefully coordinated interventions to prevent structural failure while limiting urban disruption and long-term costs. Traditional maintenance planning approaches are typically reactive or rule-based, and struggle to account for stochastic degradation processes and the non-linear interaction costs associated with simultaneous street and traffic closures in dense urban networks.
This thesis proposes a multi-agent Deep Reinforcement Learning (DRL) framework for generating optimal, constraint-aware maintenance policies. The framework combines DRL with structural risk assessments and network-level societal impacts quantified through the Urban Strategy digital twin. To reflect the operational realities of public asset management, multiple constraint-handling mechanisms are integrated into the learning process, ranging from hard, component-level constraints to soft, network-level constraints, jointly capturing financial, safety, and societal considerations.
Experimental results show that the DRL agent consistently converges to optimal or near-optimal policies while satisfying both hard and soft operational constraints, and significantly outperforms existing reactive maintenance strategies. Benchmarking against classical optimization methods demonstrates that the DRL approach maintains near-optimal performance as the problem scales to higher-dimensional environments where traditional methods become computationally intractable. Furthermore, warm-start initialization via curriculum learning is shown to substantially improve training stability, convergence speed, and solution quality in constrained settings.
Overall, this research demonstrates that constraint-aware DRL can serve as a powerful and flexible decision-support tool for urban infrastructure management. By bridging the gap between theoretical optimization and real-world operational constraints, the proposed framework provides a scalable foundation for future applications in city-scale asset management and related domains.
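The abstract distinguishes hard, component-level constraints from soft, network-level constraints but does not say how each is enforced. The sketch below illustrates one common pairing, offered purely as an assumption and not as the thesis's method: action masking for a hard constraint, and a reward penalty for a soft one. The action set, the failure-probability threshold, the closure budget, and the penalty weight are all hypothetical values chosen for illustration.

```python
import math
import random


def masked_softmax(logits, allowed):
    """Turn logits into probabilities, giving zero mass to disallowed actions."""
    if not any(allowed):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, ok in zip(logits, allowed) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_reward(cost, closures, closure_budget, penalty_weight):
    """Soft constraint: exceeding the budget is permitted but penalized."""
    violation = max(0, closures - closure_budget)
    return -cost - penalty_weight * violation


# Hypothetical per-component actions: 0 = do nothing, 1 = repair, 2 = replace.
logits = [2.0, 0.5, 0.1]

# Hypothetical hard rule: a component whose failure probability exceeds the
# threshold may not be left without intervention, so action 0 is masked out.
failure_prob, threshold = 0.3, 0.2
allowed = [failure_prob <= threshold, True, True]

probs = masked_softmax(logits, allowed)
action = random.choices(range(len(probs)), weights=probs)[0]

# Hypothetical soft rule: at most 3 simultaneous street closures network-wide.
reward = penalized_reward(cost=10.0, closures=4, closure_budget=3, penalty_weight=5.0)
```

Masking guarantees the hard rule can never be violated, because the prohibited action has zero probability of being sampled. The penalty only discourages violations of the soft rule, leaving the agent free to trade them off against cost.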