QRES-MARL

A Resilience-Based Multi-Agent Reinforcement Learning Framework for Post-Earthquake Recovery of Interdependent Infrastructures

Master Thesis (2025)
Author(s)

A. Mavrotas (TU Delft - Architecture and the Built Environment)

Contributor(s)

C. Andriotis – Mentor (TU Delft - Structures & Materials)

Simona Bianchi – Mentor (TU Delft - Structures & Materials)

Faculty
Architecture and the Built Environment
Publication Year
2025
Language
English
Graduation Date
23-06-2025
Awarding Institution
Delft University of Technology
Programme
Architecture, Urbanism and Building Sciences | Building Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This thesis investigates multi-agent reinforcement learning (MARL) as a decision-support tool for post-earthquake repair scheduling of interdependent infrastructure. MARL is a multi-agent machine-learning paradigm that combines traditional ML research with game-theoretic approaches. Given the increasing frequency of natural disasters and the scarcity of post-disaster decision tools, such tools are crucial for increasing the climate resilience of cities. Because earthquake events and the losses they cause are stochastic, MARL can help navigate this uncertainty and identify preferable joint policies. The methodology involves multi-scenario seismic hazard assessment, stochastic fragility modelling, and prediction of several direct and indirect losses, which are aggregated into a holistic community resilience metric. This metric is then used to compute instantaneous and cumulative recovery resilience loss values. The approach is tested on two custom-built test-beds of 4 and 30 components, and MARL is compared against baseline solvers, including random and importance-based policies. The algorithms tested are Value Decomposition Network with Parameter Sharing (VDN-PS), Q-Learning with Mixer Network and Parameter Sharing (QMIX-PS), and Deep Centralised Multi-Agent Actor-Critic (DCMAC). VDN and QMIX are shown to perform similarly to each other and sub-optimally relative to DCMAC. DCMAC matches importance-based policies when considering full recovery, but convincingly outperforms all other DRL methods and importance-based policies when considering partial recovery. This shows that DCMAC, and DRL more generally, is effective at swift early recovery by prioritising components that contribute most to community functionality.
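The core idea behind VDN-PS mentioned in the abstract can be sketched in a few lines: the joint value factorises as a sum of per-agent values computed by one shared network, so the greedy joint action reduces to independent per-agent argmaxes. The network sizes, observation dimension, and action names below are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 4    # e.g. one agent per component, as in the 4-component test-bed
N_ACTIONS = 2   # illustrative action space: 0 = do nothing, 1 = repair
OBS_DIM = 3     # size of each agent's local observation (assumption)
HIDDEN = 16

# Parameter sharing (the "-PS" suffix): every agent evaluates the SAME weights.
W1 = rng.normal(scale=0.1, size=(OBS_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, N_ACTIONS))

def agent_q(obs):
    """Per-agent action values from the shared two-layer network."""
    return np.tanh(obs @ W1) @ W2   # shape: (N_ACTIONS,)

def vdn_greedy(observations):
    """VDN additivity: Q_tot(s, a) = sum_i Q_i(o_i, a_i).

    Because Q_tot is a plain sum, maximising it over the joint action
    decomposes into each agent's independent argmax over its own values.
    """
    per_agent = np.stack([agent_q(o) for o in observations])
    actions = per_agent.argmax(axis=1)      # decentralised greedy actions
    q_tot = per_agent.max(axis=1).sum()     # centralised joint value
    return q_tot, actions

obs = rng.normal(size=(N_AGENTS, OBS_DIM))
q_tot, actions = vdn_greedy(obs)
```

QMIX replaces the plain sum with a learned monotonic mixing network over the per-agent values, which is why the two methods behave similarly when a simple additive factorisation already fits the task.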
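The instantaneous and cumulative recovery resilience loss values mentioned above can be illustrated with a minimal sketch based on the standard "resilience triangle" idea: instantaneous loss is the shortfall from full functionality, and cumulative loss is the area of that shortfall over the recovery horizon. The aggregation of direct and indirect losses into the functionality trajectory in the thesis is more elaborate; the trajectory below is a toy example.

```python
import numpy as np

def resilience_loss(functionality, dt=1.0):
    """Instantaneous and cumulative resilience loss from a community
    functionality trajectory Q(t) in [0, 1].

    Instantaneous loss at step t is 1 - Q(t); cumulative loss is its
    time integral (trapezoidal rule) over the recovery horizon.
    """
    q = np.asarray(functionality, dtype=float)
    inst = 1.0 - q                                   # instantaneous loss
    cum = dt * (inst[:-1] + inst[1:]).sum() / 2.0    # trapezoidal integral
    return inst, cum

# Toy trajectory: the earthquake drops functionality to 40%,
# then repairs restore it over four unit time steps.
q_t = [1.0, 0.4, 0.6, 0.8, 1.0]
inst, cum = resilience_loss(q_t)
print(cum)  # 1.2
```

A reward of the form "negative resilience loss per step" is what lets the agents trade off which components to repair first: prioritising components that restore the most community functionality earliest shrinks the area fastest, matching the partial-recovery advantage reported for DCMAC.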

Files

P5_Presentation.pdf
(pdf | 21.1 Mb)
License info not available