No-Regret Learning in Network Stochastic Zero-Sum Games

Journal Article (2026)
Author(s)

Shijie Huang (TU Delft - Mechanical Engineering)

Jinlong Lei (Tongji University)

Yiguang Hong (Tongji University)

Research Group
Team Sergio Grammatico
DOI related publication
https://doi.org/10.1007/s11424-026-5503-2 Final published version
More Info
expand_more
Publication Year
2026
Language
English
Research Group
Team Sergio Grammatico
Journal title
Journal of Systems Science and Complexity
Issue number
2
Volume number
39
Pages (from-to)
831-856
Downloads counter
10
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

No-regret learning has been widely used to compute a Nash equilibrium in two-person zero-sum games. However, there is still a lack of regret analysis for network stochastic zero-sum games, where players competing in two subnetworks only have access to some local information, and the cost functions are subject to stochastic uncertainty. Such a game model can be found in network interdiction problems, when a group of inspectors work together to detect a group of evaders. In this paper, the authors propose a distributed stochastic mirror descent (D-SMD) method, and establish the regret bounds O(T) and O(log T) in the expected sense for convex-concave and strongly convex-strongly concave costs, respectively. The proposed bounds match those of the best known first-order online optimization algorithms. The authors then prove the convergence of the time-averaged iterates of D-SMD to the set of Nash equilibria. Finally, the authors show that the actual iterates of D-SMD almost surely converge to the Nash equilibrium in the strictly convex-strictly concave setting.

Files

S11424-026-5503-2.pdf
(pdf | 0.84 Mb)
Taverne
warning

File under embargo until 28-10-2026