Poincaré-Bendixson Limit Sets in Multi-Agent Learning

Conference Paper (2022)
Author(s)

Aleksander Czechowski (TU Delft - Interactive Intelligence)

Georgios Piliouras (Singapore University of Technology and Design)

Research Group
Interactive Intelligence
Copyright
© 2022 A.T. Czechowski, Georgios Piliouras
Publication Year
2022
Language
English
Pages (from-to)
318-326
ISBN (electronic)
978-171385433-3
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A key challenge of evolutionary game theory and multi-agent learning is to characterize the limit behavior of game dynamics. Whereas convergence is often a property of learning algorithms in games satisfying a particular reward structure (e.g., zero-sum games), even basic learning models, such as the replicator dynamics, are not guaranteed to converge for general payoffs. Worse yet, chaotic behavior is possible even in rather simple games, such as variants of the Rock-Paper-Scissors game. Although chaotic behavior in learning dynamics can be precluded by the celebrated Poincaré-Bendixson theorem, it is only applicable to low-dimensional settings. Are there other characteristics of a game that can force regularity in the limit sets of learning? We show that behavior consistent with the Poincaré-Bendixson theorem (limit cycles, but no chaotic attractor) can follow purely from the topological structure of the interaction graph, even for high-dimensional settings with an arbitrary number of players and arbitrary payoff matrices. We prove our result for a wide class of follow-the-regularized-leader (FoReL) dynamics, which generalize replicator dynamics, for binary games characterized by interaction graphs where the payoffs of each player are only affected by one other player (i.e., interaction graphs of indegree one). Since chaos occurs already in games with only two players and three strategies, this class of non-chaotic games may be considered maximal. Moreover, we provide simple conditions under which such behavior translates into efficiency guarantees, implying that FoReL learning achieves a time-averaged sum of payoffs at least as good as that of a Nash equilibrium, thereby connecting the topology of the dynamics to social-welfare analysis.
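The setting in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It assumes replicator dynamics (one member of the FoReL family), two strategies per player, and a directed cycle as the indegree-one interaction graph. The names `replicator_rhs`, `rk4_step`, `simulate`, `pred`, and `payoffs` are hypothetical, and the random payoff matrices and the fixed-step Runge-Kutta integrator are illustrative choices only.

```python
import random


def replicator_rhs(x, payoffs, pred):
    """Replicator vector field for a binary game on an indegree-one graph.

    x[i]       : probability that player i plays strategy 0
    payoffs[i] : 2x2 matrix; payoffs[i][s][t] is player i's payoff for
                 playing s when its single in-neighbour plays t
    pred[i]    : index of the one player whose strategy affects player i
    """
    dx = []
    for i, xi in enumerate(x):
        y = x[pred[i]]  # in-neighbour's probability of strategy 0
        a = payoffs[i]
        u0 = a[0][0] * y + a[0][1] * (1.0 - y)  # expected payoff of strategy 0
        u1 = a[1][0] * y + a[1][1] * (1.0 - y)  # expected payoff of strategy 1
        # Two-strategy replicator equation: dx/dt = x (1 - x) (u0 - u1)
        dx.append(xi * (1.0 - xi) * (u0 - u1))
    return dx


def rk4_step(x, payoffs, pred, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shift(k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    k1 = replicator_rhs(x, payoffs, pred)
    k2 = replicator_rhs(shift(k1, dt / 2), payoffs, pred)
    k3 = replicator_rhs(shift(k2, dt / 2), payoffs, pred)
    k4 = replicator_rhs(shift(k3, dt), payoffs, pred)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, payoffs, pred, dt=0.01, steps=2000):
    """Return the list of states visited, starting from x0."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(rk4_step(traj[-1], payoffs, pred, dt))
    return traj


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    n = 5
    pred = [(i - 1) % n for i in range(n)]  # directed cycle: indegree one
    payoffs = [[[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
               for _ in range(n)]
    x0 = [rng.uniform(0.1, 0.9) for _ in range(n)]
    final_state = simulate(x0, payoffs, pred)[-1]
    print([round(v, 3) for v in final_state])
```

A numerical run of this kind only shows what a trajectory in such a game looks like. It cannot establish the absence of chaos, which is what the paper proves analytically.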

Files

3535850.3535887.pdf
(pdf | 1.46 Mb)
- Embargo expired on 01-07-2023
License info not available