Unlocking the Flexibility of District Heating Pipeline Energy Storage with Reinforcement Learning

Journal Article (2022)
Author(s)

Ksenija Stepanovic (TU Delft - Algorithmics)

J. Wu (Flex Technologies, TU Delft - Algorithmics)

Rob Everhardt (Flex Technologies)

Mathijs M. de Weerdt (TU Delft - Algorithmics)

Research Group
Algorithmics
Copyright
© 2022 K. Stepanovic, J. Wu, Rob Everhardt, M.M. de Weerdt
DOI related publication
https://doi.org/10.3390/en15093290
Publication Year
2022
Language
English
Issue number
9
Volume number
15
Pages (from-to)
1-25
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Integrating pipeline energy storage into the control of a district heating system can increase profit, for example by adjusting the electricity production of a combined heat and power (CHP) unit to the fluctuating electricity price. Uncertainty in the environment, the computational complexity of an accurate model, and the scarcity of installed sensors in a district heating system make the operational use of pipeline energy storage challenging. The vast majority of previous work derived a control strategy by decomposing a mixed-integer nonlinear model and applying significant simplifications. To mitigate the resulting stability, feasibility, and computational complexity challenges, we model CHP economic dispatch as a Markov decision process. We use a reinforcement learning (RL) algorithm to estimate the system’s dynamics through interactions with the simulation environment. The RL approach is compared with a detailed nonlinear mathematical optimizer on day-ahead and real-time electricity markets and on two district heating grid models. The proposed method achieves moderate profit, which is affected by environment stochasticity. The advantages of the RL approach are reflected in three aspects: stability, feasibility, and time scale flexibility. We conclude that RL is a promising alternative for real-time control of complex, nonlinear industrial systems.
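
To make the idea of casting CHP dispatch as a Markov decision process more concrete, the following is a minimal toy sketch and not the model used in the article. Everything in it is an assumption made for illustration: the pipeline is reduced to a single scalar heat buffer, the price profile, demand, costs, and penalty are invented values, and tabular Q-learning is used only because it is the simplest RL method to show (the abstract does not name the algorithm). The state is the pair (hour, buffer level), the action is the hourly heat output of the CHP unit, and the reward is electricity revenue minus fuel cost and a penalty for unmet heat demand.

```python
import random

# All constants below are hypothetical toy values, not taken from the article.
HOURS = 24            # one episode is one day with hourly steps
S_MAX = 4             # capacity of the simplified pipeline buffer (heat units)
ACTIONS = (0, 1, 2)   # CHP heat output per hour (heat units)
DEMAND = 1            # constant hourly heat demand (heat units)
FUEL_COST = 3.0       # cost per heat unit produced
POWER_RATIO = 0.8     # electricity units produced per heat unit
PENALTY = 20.0        # cost per unit of unmet heat demand
# Invented price profile: cheap at night, expensive in the evening.
PRICE = [2.0] * 7 + [5.0] * 10 + [9.0] * 4 + [2.0] * 3


def step(hour, storage, action):
    """Advance the toy system by one hour; return (next_storage, reward)."""
    available = storage + action
    unmet = max(0, DEMAND - available)
    # Heat above the buffer capacity is simply discarded in this sketch.
    next_storage = min(S_MAX, max(0, available - DEMAND))
    revenue = PRICE[hour] * POWER_RATIO * action
    reward = revenue - FUEL_COST * action - PENALTY * unmet
    return next_storage, reward


def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(h, s): [0.0] * len(ACTIONS)
         for h in range(HOURS) for s in range(S_MAX + 1)}
    for _ in range(episodes):
        storage = 0
        for hour in range(HOURS):
            values = q[(hour, storage)]
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=values.__getitem__)
            next_storage, reward = step(hour, storage, ACTIONS[a])
            # The episode ends after the last hour, so no future value there.
            future = max(q[(hour + 1, next_storage)]) if hour + 1 < HOURS else 0.0
            values[a] += alpha * (reward + gamma * future - values[a])
            storage = next_storage
    return q


def greedy_profit(q):
    """Roll out the greedy policy for one day and return its total reward."""
    storage, total = 0, 0.0
    for hour in range(HOURS):
        values = q[(hour, storage)]
        a = max(range(len(ACTIONS)), key=values.__getitem__)
        storage, reward = step(hour, storage, ACTIONS[a])
        total += reward
    return total


if __name__ == "__main__":
    q_table = train()
    print("greedy daily profit (toy units):", greedy_profit(q_table))
```

In this sketch the agent never sees the transition rule inside `step`; it only observes states and rewards, which mirrors the abstract's point that the dynamics are learned through interaction with a simulation environment. A realistic environment would replace `step` with a thermal-hydraulic grid simulator and a stochastic price and demand model.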