
Details

Author(s) / Contributors
Title
Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games
Is Part Of
  • Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, 2022, p.736-749
Place / Publisher
New York, NY, USA: ACM
Year of Publication
2022
Link to Full Text
Source
ACM Digital Library
Descriptions / Notes
  • Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS '21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is O(polylog(T)) after T repetitions of the game. In this paper we extend their result from external regret to internal and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of O(T^{-1}). This substantially improves over the prior best rate of convergence of O(T^{-3/4}) due to Chen and Peng (NeurIPS '20), and it is optimal up to polylogarithmic factors. To obtain these results, we develop new techniques for establishing higher-order smoothness for learning dynamics involving fixed point operations. Specifically, we first establish that the no-internal-regret learning dynamics of Stoltz and Lugosi (Mach Learn '05) are equivalently simulated by no-external-regret dynamics on a combinatorial space. This allows us to trade the computation of the stationary distribution on a polynomial-sized Markov chain for a (much better-behaved) linear transformation on an exponential-sized set, enabling us to leverage similar techniques as DFG to near-optimally bound the internal regret. Moreover, we establish an O(polylog(T)) no-swap-regret bound for the classic algorithm of Blum and Mansour (BM) (JMLR '07). We do so by introducing a technique based on the Cauchy Integral Formula that circumvents the more limited combinatorial arguments of DFG. In addition to shedding clarity on the near-optimal regret guarantees of BM, our arguments provide insights into the various ways in which the techniques by DFG can be extended and leveraged in the analysis of more involved learning algorithms.
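    The update rule at the heart of the dynamics discussed in the abstract, Optimistic Multiplicative Weights Update, can be sketched as follows. This is a minimal single-player illustration, not the paper's implementation: the loss-feedback interface, the fixed step size, and the use of the previous round's loss as the optimistic prediction are all assumptions made here for concreteness.

    ```python
    import numpy as np

    def omwu(loss_fn, n_actions, T, eta=0.1):
        """Sketch of Optimistic Multiplicative Weights Update.

        loss_fn(x, t) returns the loss vector observed in round t
        when the learner plays mixed strategy x. Returns the list of
        strategies played in rounds 0..T-1.
        """
        cum_loss = np.zeros(n_actions)
        last_loss = np.zeros(n_actions)  # optimistic prediction of the next loss
        strategies = []
        for t in range(T):
            # Optimism: the last observed loss vector is counted twice,
            # once as part of the history and once as the prediction.
            logits = -eta * (cum_loss + last_loss)
            x = np.exp(logits - logits.max())  # stabilized softmax
            x /= x.sum()
            strategies.append(x)
            loss = loss_fn(x, t)
            cum_loss += loss
            last_loss = loss
        return strategies
    ```

    Under a fixed loss vector the strategy concentrates on the lowest-loss action, which is the standard sanity check for any multiplicative-weights variant; the polylog(T) external-regret guarantee of DFG concerns the game setting, where every player's loss is induced by the others' (also predictable) strategies.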
Language
English
Identifiers
ISBN: 9781450392648, 1450392644
DOI: 10.1145/3519935.3520031
Title ID: cdi_acm_books_10_1145_3519935_3520031
