MARL-iDR: Multi-Agent Reinforcement Learning for Incentive-based Residential Demand Response

Author: van Tilburg, Jasper (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Cavalcante Siebert, L. (mentor); Cremer, Jochen (mentor)
Degree granting institution: Delft University of Technology
Date: 2021-11-30

Abstract:
Distribution System Operators (DSOs) are responsible for preventing grid congestion while accounting for growing demand and the intermittent nature of renewable energy resources. Incentive-based demand response programs promise real-time flexibility to relieve grid congestion. To include residential consumers in these programs, aggregators can financially incentivize participants to reduce their energy demand and make the aggregated energy reduction available to DSOs. A key challenge for aggregators is to coordinate the heterogeneous preferences of multiple participants while preserving their privacy. This thesis proposes MARL-iDR, a decentralized Multi-Agent Reinforcement Learning approach to an incentive-based demand response program. The approach respects participants' privacy and preferences and makes decisions in real time when deployed. The aggregator and each participant are controlled by Deep Reinforcement Learning agents that learn to maximize their reward. The aggregator agent learns a policy that dispatches suitable incentives to participants based on total energy demand and a target reduction, while minimizing financial costs. Each participant agent learns to respond to these incentives by reducing consumption to a fraction of its original demand: it curtails or shifts requested household appliances according to the selected consumption reduction using a novel Disjunctively Constrained Knapsack Problem optimization, while minimizing residents' dissatisfaction.
A case study with real-world electricity data from 25 households demonstrates the approach's capability to induce demand-side flexibility. The approach is compared to the case without demand response and to a centralized myopic baseline. A 9% reduction of the Peak-to-Average Ratio (PAR) was achieved compared to the original PAR without demand response.

Subjects: Reinforcement Learning (RL); Demand Response; Multi-Agent Systems
To reference this document use: http://resolver.tudelft.nl/uuid:4080723c-1e96-4a03-a40c-d7b74254e9d0
Part of collection: Student theses
Document type: master thesis
Rights: © 2021 Jasper van Tilburg
Files: Master_Thesis_JaspervanTilburg.pdf (PDF, 5.57 MB)
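The Peak-to-Average Ratio reported in the abstract is the standard metric: peak demand divided by average demand over the horizon. A minimal sketch, with illustrative (non-thesis) data:

```python
def peak_to_average_ratio(demand):
    """PAR = peak demand divided by average demand over the horizon."""
    return max(demand) / (sum(demand) / len(demand))

# Hypothetical hourly aggregate demand in kW, for illustration only.
original = [30, 32, 35, 60, 58, 40, 33, 31]
par = peak_to_average_ratio(original)
# A 9% reduction, as reported in the case study, would mean the
# post-demand-response PAR is roughly 0.91 times this value.
```

Lowering PAR means the load profile is flatter, which is precisely the demand-side flexibility the aggregator tries to induce with its incentives.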