Title: Large-Scale Wildfire Mitigation Through Deep Reinforcement Learning

Authors: Altamimi, Abdulelah (The Pennsylvania State University); Lagoa, Constantino (The Pennsylvania State University); Borges, José G. (University of Lisbon); McDill, Marc E. (The Pennsylvania State University); Andriotis, C. (TU Delft Structural Design & Mechanics); Papakonstantinou, K. G. (The Pennsylvania State University)

Date: 2022

Abstract: Forest management can be seen as a sequential decision-making problem: determine an optimal scheduling policy, e.g., harvest, thinning, or do-nothing, that mitigates the risk of wildfire. Markov Decision Processes (MDPs) offer an efficient mathematical framework for optimizing forest management policies. However, computing optimal MDP solutions is computationally challenging for large-scale forests due to the curse of dimensionality, as the total number of forest states grows exponentially with the number of stands into which the forest is discretized. In this work, we propose a Deep Reinforcement Learning (DRL) approach to improve forest management plans that track the forest dynamics over a large area. The approach emphasizes prevention and mitigation of wildfire risks by determining highly efficient management policies. A large-scale forest model is designed using a spatial MDP that divides the square-matrix forest into equal stands. The model makes the probability of wildfire depend on the forest timber volume, the flammability, and the directional distribution of the wind, using data that reflect the inventory of a typical eucalypt (Eucalyptus globulus Labill) plantation in Portugal. In this spatial MDP, the agent (decision-maker) takes an action at one stand at each step. We use an off-policy actor-critic reinforcement learning approach with experience replay to approximate the optimal MDP policy.
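To make the spatial MDP concrete, the following is a minimal toy sketch of such an environment: a square grid of stands, one action per stand per step, and a stochastic wildfire whose ignition probability grows with timber volume and which spreads preferentially downwind. All names, constants, and dynamics here are illustrative assumptions, not the paper's calibrated eucalypt model.

```python
import numpy as np

# Hypothetical action set and parameters; the paper's calibrated values
# (volume growth, flammability, wind distribution) are not reproduced here.
ACTIONS = ("do_nothing", "thin", "harvest")

class SpatialForestMDP:
    """Toy spatial MDP: an n-by-n grid of stands, each holding a timber volume.

    At each step the agent picks one stand and one action. Wildfire ignites
    stochastically with a probability that grows with stand volume and
    spreads one stand downwind -- a rough stand-in for a
    volume/flammability/wind-dependent fire model.
    """

    def __init__(self, n=4, base_ignition=0.01, wind=(1, 0), seed=0):
        self.n = n
        self.base_ignition = base_ignition
        self.wind = wind  # dominant wind direction as (row, col) offset
        self.rng = np.random.default_rng(seed)
        self.volume = self.rng.uniform(0.0, 1.0, size=(n, n))

    def step(self, stand, action):
        r, c = stand
        reward = 0.0
        if action == "harvest":        # sell all timber in the stand
            reward += self.volume[r, c]
            self.volume[r, c] = 0.0
        elif action == "thin":         # sell part, lowering fire risk
            reward += 0.3 * self.volume[r, c]
            self.volume[r, c] *= 0.7
        self.volume = np.minimum(self.volume + 0.05, 1.0)  # growth, capped
        reward -= self._wildfire()     # burned volume is lost value
        return self.volume.copy(), reward

    def _wildfire(self):
        burned = 0.0
        # Denser stands ignite more easily (toy flammability model).
        ignition_p = self.base_ignition * (1.0 + self.volume)
        dr, dc = self.wind
        for r in range(self.n):
            for c in range(self.n):
                if self.rng.random() < ignition_p[r, c]:
                    burned += self.volume[r, c]
                    self.volume[r, c] = 0.0
                    # Fire spreads one stand downwind with 50% probability.
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < self.n and 0 <= cc < self.n \
                            and self.rng.random() < 0.5:
                        burned += self.volume[rr, cc]
                        self.volume[rr, cc] = 0.0
        return burned

env = SpatialForestMDP(n=4)
state, reward = env.step((0, 0), "harvest")
print(state.shape)  # (4, 4)
```

The per-stand action structure is what keeps the action space linear in the number of stands, even though the joint state space over all stands grows exponentially — the dimensionality problem the DRL agent is meant to sidestep.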
In three case studies, the approach shows good scalability in providing large-scale forest management plans. The expected return and the computed DRL policy are identical to the exact optimal MDP solution when that exact solution is available, i.e., for low-dimensional models. DRL is also found to outperform genetic algorithm (GA) solutions, which were used as benchmarks for the large-scale model policy.

Subject: deep reinforcement learning; dynamic programming; forest management; Markov Decision Process; wildfire mitigation

To reference this document use: http://resolver.tudelft.nl/uuid:f4cc2b8f-b805-4d11-96e6-6a61a19037eb

DOI: https://doi.org/10.3389/ffgc.2022.734330

Source: Frontiers in Forests and Global Change, 5

Part of collection: Institutional Repository

Document type: journal article

Rights: © 2022 Abdulelah Altamimi, Constantino Lagoa, José G. Borges, Marc E. McDill, C. Andriotis, K. G. Papakonstantinou

Files: ffgc_05_734330.pdf (PDF, 2.35 MB)