Reinforcement Learning based Energy Management System for Smart Buildings

Master Thesis (2022)
Author(s)

N. van den Bovenkamp (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Pedro Pablo Vergara – Mentor (TU Delft - Intelligent Electrical Power Grids)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2022 Nick van den Bovenkamp
Publication Year
2022
Language
English
Graduation Date
14-03-2022
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Smart buildings, equipped with photovoltaic (PV) generation, controllable electricity consumption, and a battery energy storage system, are expected to play a crucial role in balancing supply and demand in future power systems.
The energy management system (EMS) of such a smart building controls the operation of these components.
Achieving an optimal dynamic control strategy is still challenging due to the stochastic nature of PV generation, electricity consumption patterns, and market prices.
Hence, this research developed an EMS based on reinforcement learning (RL) with linear function approximation that minimizes day-ahead electricity costs.
The proposed Q-learning with tile coding (QLTC) EMS is compared against the solutions of a deterministic mixed-integer linear programming (MILP) model to validate whether the proposed approach reaches good-quality solutions. Furthermore, the QLTC EMS's generalization capabilities are evaluated, an aspect missing in the literature.
A case study on an industrial manufacturing company in the Netherlands, using historical electricity consumption, PV generation, and wholesale electricity prices, is carried out to examine the QLTC EMS's performance.
The results show that the QLTC's returns consistently converge to the negative of the MILP's electricity costs, indicating that the QLTC reaches a good-quality control policy.
Over one week of operation, the EMS effectively shifts its power consumption to favorable price moments, with the QLTC's electricity costs coming within 99% of the MILP's.
Furthermore, the results demonstrate that the QLTC approach can deploy a reasonable control policy on days it has never encountered by generalizing from previously learned control policies.
On average, it deploys a control policy that reaches 80% of the MILP's optimum on a test week after being trained on 85 days of data.
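
To make the core technique named in the abstract concrete, below is a minimal Python sketch of Q-learning with tile coding. It is not the thesis's implementation: the state variables (battery state of charge and a normalized price), the three-action set, the tiling configuration, the learning parameters, and the reward signal are all illustrative assumptions.

import numpy as np

# Minimal sketch of Q-learning with tile coding (QLTC), for illustration only.
# State variables, action set, tilings, and reward are assumed, not taken
# from the thesis's actual EMS formulation.

N_TILINGS = 8                # number of overlapping tilings (assumed)
TILES_PER_DIM = 10           # tiles per state dimension in each tiling
ACTIONS = [-1.0, 0.0, 1.0]   # e.g. discharge / idle / charge (assumed)

def active_tile(state, tiling):
    """Return the index of the single active tile in one tiling.

    `state` is assumed normalized to [0, 1] per dimension; each tiling is
    shifted by a fraction of a tile width so the tilings overlap.
    """
    offset = tiling / (N_TILINGS * TILES_PER_DIM)
    idx = 0
    for x in state:
        t = min(int((x + offset) * TILES_PER_DIM), TILES_PER_DIM - 1)
        idx = idx * TILES_PER_DIM + t
    return tiling * TILES_PER_DIM ** len(state) + idx

def q_value(w, state, a_i):
    """Q(s, a) is linear in the binary tile features: a sum of weights."""
    return sum(w[a_i, active_tile(state, k)] for k in range(N_TILINGS))

def q_learning_step(w, s, a_i, r, s_next, alpha=0.1 / N_TILINGS, gamma=0.99):
    """One Q-learning update applied to the weights of the active tiles."""
    target = r + gamma * max(q_value(w, s_next, b) for b in range(len(ACTIONS)))
    td_error = target - q_value(w, s, a_i)
    for k in range(N_TILINGS):
        w[a_i, active_tile(s, k)] += alpha * td_error

# Weight matrix: one row per action, one entry per tile across all tilings
# (two state dimensions assumed here).
n_features = N_TILINGS * TILES_PER_DIM ** 2
w = np.zeros((len(ACTIONS), n_features))

# Example update: state = (state of charge, normalized price);
# the reward is the negative electricity cost of the step.
s, s_next = (0.4, 0.7), (0.5, 0.6)
q_learning_step(w, s, a_i=2, r=-0.3, s_next=s_next)

Because Q(s, a) is linear in binary tile features, each update only touches the weights of the tiles active in the current state, and overlapping tilings spread that update to nearby states. This is the mechanism behind the generalization to unseen days evaluated in the thesis.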
