Reinforcement Learning based Energy Management System for Smart Buildings

Master Thesis (2022)
Author(s)

N. van den Bovenkamp (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Pedro Pablo Vergara – Mentor (TU Delft - Intelligent Electrical Power Grids)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Nick van den Bovenkamp
Publication Year
2022
Language
English
Graduation Date
14-03-2022
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Smart buildings, equipped with photovoltaic (PV) generation, controllable electricity consumption, and a battery energy storage system, are expected to play a crucial role in balancing supply and demand in future power systems.
The energy management system (EMS) of such a smart building controls the operation of these components.
Achieving an optimal dynamic control strategy is still challenging due to the stochastic nature of PV generation, electricity consumption patterns, and market prices.
Hence, this research developed an EMS based on reinforcement learning (RL) with linear function approximation that minimizes day-ahead electricity costs.
The proposed Q-learning with tile coding (QLTC) EMS is compared against the solutions of a deterministic mixed-integer linear programming (MILP) model to validate whether the proposed approach reaches good-quality solutions. Furthermore, the QLTC EMS's generalization capabilities are evaluated, an aspect missing in the literature.
A case study on an industrial manufacturing company in the Netherlands, using historical electricity consumption, PV generation, and wholesale electricity prices, is carried out to examine the QLTC EMS's performance.
The results show that the QLTC's returns consistently converge to the negative of the MILP's electricity costs, indicating that the QLTC reaches a good-quality control policy.
Over one week of operation, the EMS effectively shifts its power consumption to favorable price moments, with the QLTC's electricity costs coming within 99% of the MILP's.
Furthermore, the results demonstrate that the QLTC approach can deploy a reasonable control policy on days it has never encountered by generalizing from previously learned control policies.
On average, it deploys a control policy that reaches 80% of the MILP's optimum on a test week after being trained on 85 days of data.
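
To make the core technique named in the abstract concrete, below is a minimal Python sketch of Q-learning with tile coding. It is not the thesis's implementation: the state variables (battery state of charge and a normalized price), the three-action set, the tiling configuration, the learning parameters, and the reward signal are all illustrative assumptions.

import numpy as np

# Minimal sketch of Q-learning with tile coding (QLTC), for illustration only.
# State variables, action set, tilings, and reward are assumed, not taken
# from the thesis's actual EMS formulation.

N_TILINGS = 8                # number of overlapping tilings (assumed)
TILES_PER_DIM = 10           # tiles per state dimension in each tiling
ACTIONS = [-1.0, 0.0, 1.0]   # e.g. discharge / idle / charge (assumed)

def active_tile(state, tiling):
    """Return the index of the single active tile in one tiling.

    `state` is assumed normalized to [0, 1] per dimension; each tiling is
    shifted by a fraction of a tile width so the tilings overlap.
    """
    offset = tiling / (N_TILINGS * TILES_PER_DIM)
    idx = 0
    for x in state:
        t = min(int((x + offset) * TILES_PER_DIM), TILES_PER_DIM - 1)
        idx = idx * TILES_PER_DIM + t
    return tiling * TILES_PER_DIM ** len(state) + idx

def q_value(w, state, a_i):
    """Q(s, a) is linear in the binary tile features: a sum of weights."""
    return sum(w[a_i, active_tile(state, k)] for k in range(N_TILINGS))

def q_learning_step(w, s, a_i, r, s_next, alpha=0.1 / N_TILINGS, gamma=0.99):
    """One Q-learning update applied to the weights of the active tiles."""
    target = r + gamma * max(q_value(w, s_next, b) for b in range(len(ACTIONS)))
    td_error = target - q_value(w, s, a_i)
    for k in range(N_TILINGS):
        w[a_i, active_tile(s, k)] += alpha * td_error

# Weight matrix: one row per action, one entry per tile across all tilings
# (two state dimensions assumed here).
n_features = N_TILINGS * TILES_PER_DIM ** 2
w = np.zeros((len(ACTIONS), n_features))

# Example update: state = (state of charge, normalized price);
# the reward is the negative electricity cost of the step.
s, s_next = (0.4, 0.7), (0.5, 0.6)
q_learning_step(w, s, a_i=2, r=-0.3, s_next=s_next)

Because Q(s, a) is linear in binary tile features, each update only touches the weights of the tiles active in the current state, and overlapping tilings spread that update to nearby states. This is the mechanism behind the generalization to unseen days evaluated in the thesis.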
