R Belmans | TU Delft Repository

Residential demand response of thermostatically controlled loads using batch Reinforcement Learning

Journal article (2017) - F Ruelens, BJ Claessens, S Vandael, Bart De Schutter, Robert Babuska, R Belmans

Driven by recent advances in batch Reinforcement Learning (RL), this paper contributes to the application of batch RL to demand response. In contrast to conventional model-based approaches, batch RL techniques do not require a system identification step, making them more suitable for a large-scale implementation. This paper extends fitted Q-iteration, a standard batch RL technique, to the situation when a forecast of the exogenous data is provided. In general, batch RL techniques do not rely on expert knowledge about the system dynamics or the solution. However, if some expert knowledge is provided, it can be incorporated by using the proposed policy adjustment method. Finally, we tackle the challenge of finding an open-loop schedule required to participate in the day-ahead market. We propose a model-free Monte Carlo method that uses a metric based on the state-action value function or Q-function and we illustrate this method by finding the day-ahead schedule of a heat-pump thermostat. Our experiments show that batch RL techniques provide a valuable alternative to model-based controllers and that they can be used to construct both closed-loop and open-loop policies. ...
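As a rough illustration of the fitted Q-iteration loop mentioned in the abstract above, the sketch below runs it on a made-up five-level thermostat problem. Everything in it is an assumption for illustration only: the toy dynamics in `step`, the cost values, the discount factor, and the per-pair averaging regressor `fit_mean` (the abstract does not name a regressor). The forecast extension and the policy adjustment method are not shown.

```python
from collections import defaultdict

N_TEMPS, ACTIONS, GAMMA = 5, (0, 1), 0.9  # hypothetical toy settings


def step(temp, action):
    """Hypothetical toy dynamics, used only to generate the batch."""
    nxt = min(N_TEMPS - 1, temp + 1) if action else max(0, temp - 1)
    # Energy cost of 1 when heating, comfort penalty of 10 at the coldest level.
    cost = 1.0 * action + (10.0 if nxt == 0 else 0.0)
    return cost, nxt


def fit_mean(inputs, targets):
    """Stand-in regressor: average the targets seen for each (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(inputs, targets):
        sums[x] += y
        counts[x] += 1
    table = {x: sums[x] / counts[x] for x in sums}
    return lambda s, a: table.get((s, a), 0.0)


def fitted_q_iteration(batch, n_iter=100):
    """Repeatedly regress cost + discounted best next Q-value on (s, a)."""
    q = lambda s, a: 0.0  # Q_0 is zero everywhere
    inputs = [(s, a) for s, a, _, _ in batch]
    for _ in range(n_iter):
        targets = [c + GAMMA * min(q(s2, a2) for a2 in ACTIONS)
                   for _, _, c, s2 in batch]
        q = fit_mean(inputs, targets)
    return q


# The learner only sees a batch of (state, action, cost, next_state) tuples.
# Here the batch enumerates every pair once to keep the example reproducible.
batch = []
for s in range(N_TEMPS):
    for a in ACTIONS:
        cost, s_next = step(s, a)
        batch.append((s, a, cost, s_next))

q_hat = fitted_q_iteration(batch)
# Greedy (cost-minimising) action per temperature level.
policy = [min(ACTIONS, key=lambda a, s=s: q_hat(s, a)) for s in range(N_TEMPS)]
print(policy)
```

In this toy setting the greedy policy heats only at the two coldest levels. The learner never calls `step` itself; it works purely from the logged tuples, which is the sense in which no system identification step is needed.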

Reinforcement learning applied to an electric water heater: From theory to practice

Journal article (2016) - Frederik Ruelens, BJ Claessens, S. Quaiyum, Bart De Schutter, Robert Babuska, R Belmans

Electric water heaters have the ability to store energy in their water buffer without impacting the comfort of the end user. This feature makes them a prime candidate for residential demand response. However, the stochastic and nonlinear dynamics of electric water heaters make it challenging to harness their flexibility. Driven by this challenge, this paper formulates the underlying sequential decision-making problem as a Markov decision process and uses techniques from reinforcement learning. Specifically, we apply an auto-encoder network to find a compact feature representation of the sensor measurements, which helps to mitigate the curse of dimensionality. A well-known batch reinforcement learning technique, fitted Q-iteration, is used to find a control policy, given this feature representation. In a simulation-based experiment using an electric water heater with 50 temperature sensors, the proposed method was able to achieve good policies much faster than when using the full state information. In a lab experiment, we apply fitted Q-iteration to an electric water heater with eight temperature sensors. Further reducing the state vector did not improve the results of fitted Q-iteration. The results of the lab experiment, spanning 40 days, indicate that compared to a thermostat controller, the presented approach was able to reduce the total cost of energy consumption of the electric water heater by 15%. ...

Optimal phase shifter coordination: a multidimensional problem

Conference paper (2006) - J Verboomen, D van Hertem, R Belmans, PH Schavemaker, WL Kling

In a liberalized electricity market, the use of phase shifting transformers or other power flow controlling devices allows the transmission system operator to utilize the available grid infrastructure in a more optimal way. However, each phase shifter adds a degree of freedom to the control problem, making optimization more difficult. In this paper, the coordination problem is solved by using Particle Swarm Optimization (PSO). The goal is to give an overview of how PSO is used to solve this particular problem, and to demonstrate a new area of application for the method. ...
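For readers unfamiliar with PSO, the following is a minimal sketch of the basic algorithm applied to two phase shifter angles. The objective `toy_cost`, the angle bounds, and the swarm parameters (`w`, `c1`, `c2`, swarm size, iteration count) are all assumptions chosen for illustration; the paper's actual objective and settings are not given in the abstract.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; one dimension per phase shifter setting."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best

    for _ in range(n_iter):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (best_pos[k][d] - pos[k][d])
                             + c2 * r2 * (g_pos[d] - pos[k][d]))
                # Keep the setting inside the allowed range.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val


def toy_cost(angles):
    """Hypothetical stand-in objective with its minimum at (5, -3) degrees."""
    a1, a2 = angles
    return (a1 - 5.0) ** 2 + (a2 + 3.0) ** 2


# Two phase shifters, each assumed to be limited to +/- 25 degrees.
best_angles, best_cost = pso(toy_cost, bounds=[(-25.0, 25.0), (-25.0, 25.0)])
print(best_angles, best_cost)
```

Each additional phase shifter adds one entry to `bounds`, which is the extra degree of freedom the abstract refers to. In a real study the objective would be evaluated with a power flow calculation rather than a closed-form function.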

Usefulness of DC power flow for active power flow analysis with flow controlling devices

Conference paper (2006) - D van Hertem, J Verboomen, K Purchala, R Belmans, WL Kling

DC power flow is a commonly used tool for contingency analysis. Recently, due to its simplicity and robustness, it has also been increasingly used for the real-time dispatch and techno-economic analysis of power systems. It is a simplification of a full power flow that looks only at active power. Aspects such as voltage support and reactive power management are impossible to analyse. Moreover, such simplifications cannot always be justified and sometimes lead to unrealistic results. In particular, the implementation of power flow controlling devices is not trivial, since standard DC power flow fundamentally neglects their effects. Until recently, this was not an issue, as the application of power flow controlling devices in the European grid was limited. However, with the liberalisation of the European electricity market and the introduction of large wind energy systems, the need for real power flow control has emerged and, therefore, the use of these devices has been reconsidered. Several phase shifting transformers (PSTs) are being installed or planned in order to control flows. Therefore, it is important to fundamentally re-validate the fast, but less accurate, DC power flow method. In this paper the assumptions of DC power flow are analysed, and its validity is assessed by comparing the results of power flow simulations using both the DC and AC approaches on a modified IEEE 300 bus system with PSTs. ...
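To make the DC approximation concrete, here is a small sketch that solves a DC power flow on a three-bus network and then adds a phase shift on one line. The network data, the per-unit values, and the way the PST is modelled (an extra angle in series with the line, moved to the right-hand side as equivalent injections) are assumptions for illustration, not the model or test system used in the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dc_power_flow(n_bus, lines, injections, slack=0):
    """lines: (from_bus, to_bus, reactance_pu, phase_shift_rad) tuples."""
    B = [[0.0] * n_bus for _ in range(n_bus)]
    p = list(injections)
    for i, j, x, alpha in lines:
        b = 1.0 / x
        B[i][i] += b
        B[j][j] += b
        B[i][j] -= b
        B[j][i] -= b
        # Flow i->j is b * (theta_i - theta_j + alpha); the alpha term is
        # moved to the right-hand side as a pair of equivalent injections.
        p[i] -= b * alpha
        p[j] += b * alpha
    keep = [k for k in range(n_bus) if k != slack]  # slack angle fixed at 0
    B_red = [[B[r][c] for c in keep] for r in keep]
    theta_red = solve_linear(B_red, [p[k] for k in keep])
    theta = [0.0] * n_bus
    for k, t in zip(keep, theta_red):
        theta[k] = t
    flows = [(theta[i] - theta[j] + alpha) / x for i, j, x, alpha in lines]
    return theta, flows


# Hypothetical system: 1 pu generated at bus 0, 1 pu consumed at bus 2.
injections = [1.0, 0.0, -1.0]
base = [(0, 1, 0.1, 0.0), (0, 2, 0.1, 0.0), (1, 2, 0.1, 0.0)]
with_pst = [(0, 1, 0.1, 0.0), (0, 2, 0.1, -0.02), (1, 2, 0.1, 0.0)]

_, flows_base = dc_power_flow(3, base, injections)
_, flows_pst = dc_power_flow(3, with_pst, injections)
print(flows_base)  # flows on lines 0-1, 0-2, 1-2 without the PST
print(flows_pst)   # the same lines with a -0.02 rad shift on line 0-2
```

In this toy case the phase shift moves part of the flow from the direct line 0-2 onto the path through bus 1, while the total delivered to bus 2 stays at 1 pu. Losses, voltage magnitudes, and reactive power are absent by construction, which is the limitation the abstract points to.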