Unlocking the Flexibility of District Heating Pipeline Energy Storage with Reinforcement Learning

Journal Article (2022)
Authors

K. Stepanovic (TU Delft - Algorithmics)

Jichen Wu (Flex Technologies, TU Delft - Algorithmics)

Rob Everhardt (Flex Technologies)

M.M. de Weerdt (TU Delft - Algorithmics)

Research Group
Algorithmics
Copyright
© 2022 K. Stepanovic, J. Wu, Rob Everhardt, M.M. de Weerdt
To reference this document use:
https://doi.org/10.3390/en15093290
Publication Year
2022
Language
English
Issue number
9
Volume number
15
Pages (from-to)
1-25
DOI:
https://doi.org/10.3390/en15093290
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Integrating pipeline energy storage into the control of a district heating system can increase profit, for example by adjusting the electricity production of a combined heat and power (CHP) unit to the fluctuating electricity price. Uncertainty in the environment, the computational complexity of an accurate model, and the scarcity of sensors in a district heating system make the operational use of pipeline energy storage challenging. The vast majority of previous work determines a control strategy by decomposing a mixed-integer nonlinear model and applying significant simplifications. To mitigate the resulting stability, feasibility, and computational complexity challenges, we model CHP economic dispatch as a Markov decision process. We use a reinforcement learning (RL) algorithm to estimate the system’s dynamics through interactions with the simulation environment. The RL approach is compared with a detailed nonlinear mathematical optimizer on day-ahead and real-time electricity markets and on two district heating grid models. The proposed method achieves moderate profit, which is affected by environment stochasticity. The advantages of the RL approach are reflected in three aspects: stability, feasibility, and time scale flexibility. We conclude that RL is a promising alternative for real-time control of complex, nonlinear industrial systems.