No-Regret Learning in Network Stochastic Zero-Sum Games

Journal Article (2026)
Author(s)

Shijie Huang (TU Delft - Mechanical Engineering)

Jinlong Lei (Tongji University)

Yiguang Hong (Tongji University)

Research Group
Team Sergio Grammatico
DOI related publication
https://doi.org/10.1007/s11424-026-5503-2 Final published version
More Info
expand_more
Publication Year
2026
Language
English
Research Group
Team Sergio Grammatico
Journal title
Journal of Systems Science and Complexity
Issue number
2
Volume number
39
Pages (from-to)
831-856
Downloads counter
4
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

No-regret learning has been widely used to compute a Nash equilibrium in two-person zero-sum games. However, there is still a lack of regret analysis for network stochastic zero-sum games, where players competing in two subnetworks only have access to some local information, and the cost functions are subject to stochastic uncertainty. Such a game model can be found in network interdiction problems, when a group of inspectors work together to detect a group of evaders. In this paper, the authors propose a distributed stochastic mirror descent (D-SMD) method, and establish the regret bounds O(T) and O(log T) in the expected sense for convex-concave and strongly convex-strongly concave costs, respectively. The proposed bounds match those of the best known first-order online optimization algorithms. The authors then prove the convergence of the time-averaged iterates of D-SMD to the set of Nash equilibria. Finally, the authors show that the actual iterates of D-SMD almost surely converge to the Nash equilibrium in the strictly convex-strictly concave setting.

Files

S11424-026-5503-2.pdf
(pdf | 0.84 Mb)
Taverne
warning

File under embargo until 28-10-2026