Non-stationarity in multiagent reinforcement learning in electricity market simulation

Abstract

The design of electricity markets may be facilitated by simulating actors’ behaviors. Recent studies model human decision-makers within markets as agents that learn strategies to maximize expected profits. This work investigates the problem of ‘non-stationarity’ in the context of market simulations: a problem with the learning algorithms used by such studies that causes agents to behave irrationally, thus limiting the studies’ applicability to real-world strategic behavior. After isolating the source of the problem for a day-ahead electricity market, this paper proposes methods that ameliorate it in simple test cases, and proves requirements under which ‘centralized-training, decentralized-execution’ value-learning methods converge to correct behavior in general. Subsequently, this paper proposes a framework for ‘adversarial market design’ that includes the market designer as an agent. This allows market designs to be optimized subject to possibly strategic behavior of participating firms, which in turn enables the automated selection of the optimal market from any set of markets.