Uncovering Sequential Social Dilemmas in Multi-Agent Reinforcement Learning
Challenges and Strategies for Local Energy Communities
M.T. Okoń (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L. Cavalcante Siebert – Mentor (TU Delft - Interactive Intelligence)
Jochen Cremer – Mentor (TU Delft - Intelligent Electrical Power Grids)
J. Yang – Graduation committee member (TU Delft - Web Information Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This thesis investigates the occurrence and mitigation of Sequential Social Dilemmas (SSDs) in Local Energy Communities (LECs) managed through Multi-Agent Reinforcement Learning (MARL). LECs could play a pivotal role in the green energy transition, yet the inherent conflict between individual incentives and community-wide objectives creates SSD scenarios that challenge learning processes. To address these issues, we propose an agent-centric approach and develop a custom MARL environment where agents interact via a communal battery system and a local trading mechanism.
We systematically investigate the impact of resource constraints and social interactions on the agents' learning. In non-cooperative settings, limited resources impede policy optimization, while the introduction of a shared battery reveals SSD dynamics driven by both greed and fear factors. Our experiments show that rescaling the training data leads agents to adopt more cooperative behaviors, and that reward function modifications incentivizing community-friendly battery use significantly increase social welfare. These mitigation techniques are further validated in a realistic LEC environment with multiple heterogeneous households engaging in trading and storage actions.
The contributions of this thesis are threefold: (1) the proposal of a new agent-centric MARL environment for LECs, (2) the demonstration of SSDs impacting MARL performance in these decentralized energy systems, and (3) the introduction of concrete strategies for aligning individual and community incentives.
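As a point of reference for the greed and fear factors mentioned above, the sketch below uses the two-player matrix-game formulation common in the social dilemma literature, where greed is the gain from defecting against a cooperator and fear is the loss avoided by defecting against a defector. This is an illustrative assumption, not necessarily how the thesis measures these factors in its sequential environment; the `Payoffs` class, the function names, and the payoff values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    """Payoffs to one player in a symmetric two-player matrix game."""
    reward: float      # R: both players cooperate
    punishment: float  # P: both players defect
    sucker: float      # S: this player cooperates, the other defects
    temptation: float  # T: this player defects, the other cooperates


def greed(p: Payoffs) -> float:
    """Incentive to exploit a cooperating partner (T - R)."""
    return p.temptation - p.reward


def fear(p: Payoffs) -> float:
    """Incentive to avoid being exploited by a defecting partner (P - S)."""
    return p.punishment - p.sucker


def is_social_dilemma(p: Payoffs) -> bool:
    """Check the standard matrix-game social dilemma conditions."""
    return (
        p.reward > p.punishment                    # mutual cooperation beats mutual defection
        and p.reward > p.sucker                    # mutual cooperation beats being exploited
        and 2 * p.reward > p.temptation + p.sucker  # cooperation beats taking turns exploiting
        and (greed(p) > 0 or fear(p) > 0)          # at least one motive to defect
    )


# Illustrative Prisoner's Dilemma payoffs (hypothetical values).
example = Payoffs(reward=3.0, punishment=1.0, sucker=0.0, temptation=5.0)
print(greed(example), fear(example), is_social_dilemma(example))
```

In a setting such as a shared battery, "cooperate" and "defect" would have to be mapped onto policies (for example, restrained versus aggressive battery use) before payoffs like these can be estimated from episode returns.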