Poincaré-Bendixson Limit Sets in Multi-Agent Learning

Conference Paper (2022)

Author(s)

Aleksander Czechowski (TU Delft - Interactive Intelligence)

Georgios Piliouras (Singapore University of Technology and Design)

Research Group

Interactive Intelligence

Keywords

Follow-the-Regularized Leader, Poincaré-Bendixson Theorem, Polymatrix Games, Regret Minimization, Replicator Dynamics

To reference this document use:

https://resolver.tudelft.nl/uuid:4b786896-7b6a-4283-bdb4-55ec5fb7142d

Publication Year

2022

Language

English

Pages (from-to)

318-326

ISBN (electronic)

978-171385433-3

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A key challenge of evolutionary game theory and multi-agent learning is to characterize the limit behavior of game dynamics. Whereas convergence is often a property of learning algorithms in games satisfying a particular reward structure (e.g., zero-sum games), even basic learning models, such as the replicator dynamics, are not guaranteed to converge for general payoffs. Worse yet, chaotic behavior is possible even in rather simple games, such as variants of the Rock-Paper-Scissors game. Although chaotic behavior in learning dynamics can be precluded by the celebrated Poincaré-Bendixson theorem, it is only applicable to low-dimensional settings. Are there other characteristics of a game that can force regularity in the limit sets of learning? We show that behavior consistent with the Poincaré-Bendixson theorem (limit cycles, but no chaotic attractor) can follow purely from the topological structure of the interaction graph, even for high-dimensional settings with an arbitrary number of players and arbitrary payoff matrices. We prove our result for a wide class of follow-the-regularized leader (FoReL) dynamics, which generalize replicator dynamics, for binary games characterized by interaction graphs where the payoffs of each player are only affected by one other player (i.e., interaction graphs of indegree one). Since chaos occurs already in games with only two players and three strategies, this class of non-chaotic games may be considered maximal. Moreover, we provide simple conditions under which such behavior translates into efficiency guarantees, implying that FoReL learning achieves a time-averaged sum of payoffs at least as good as that of a Nash equilibrium, thereby connecting the topology of the dynamics to social-welfare analysis.

Files

3535850.3535887.pdf

(pdf | 1.46 Mb)

- Embargo expired on 01-07-2023

License info not available