On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games

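Summary

Non-ergodic convergence of learning dynamics in games has recently been studied widely because of its importance in both theory and practice.
Recent work (Cai et al., 2024) showed that a broad class of learning dynamics, including Optimistic Multiplicative Weights Update (OMWU), can exhibit arbitrarily slow last-iterate convergence even in simple $2 \times 2$ matrix games.
However, it remains unclear whether these algorithms achieve fast non-ergodic convergence under weaker criteria, such as best-iterate convergence.
The paper shows that for $2 \times 2$ matrix games, OMWU achieves an $O(T^{-1/6})$ best-iterate convergence rate, in stark contrast to its slow last-iterate convergence in the same class of games.
It further establishes a lower bound showing that OMWU does not achieve any polynomial random-iterate convergence rate, as measured by the expected duality gap across all iterates.
This result challenges the conventional wisdom that random-iterate convergence is essentially equivalent to best-iterate convergence, with the former often used as a proxy for establishing the latter.
The analysis uncovers a new connection to dynamic regret and presents a novel two-phase approach to best-iterate convergence, which could be of independent interest.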

Abstract (original)

Non-ergodic convergence of learning dynamics in games is widely studied recently because of its importance in both theory and practice. Recent work (Cai et al., 2024) showed that a broad class of learning dynamics, including Optimistic Multiplicative Weights Update (OMWU), can exhibit arbitrarily slow last-iterate convergence even in simple $2 \times 2$ matrix games, despite many of these dynamics being known to converge asymptotically in the last iterate. It remains unclear, however, whether these algorithms achieve fast non-ergodic convergence under weaker criteria, such as best-iterate convergence. We show that for $2\times 2$ matrix games, OMWU achieves an $O(T^{-1/6})$ best-iterate convergence rate, in stark contrast to its slow last-iterate convergence in the same class of games. Furthermore, we establish a lower bound showing that OMWU does not achieve any polynomial random-iterate convergence rate, measured by the expected duality gaps across all iterates. This result challenges the conventional wisdom that random-iterate convergence is essentially equivalent to best-iterate convergence, with the former often used as a proxy for establishing the latter. Our analysis uncovers a new connection to dynamic regret and presents a novel two-phase approach to best-iterate convergence, which could be of independent interest.
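The three criteria contrasted in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes one common form of the OMWU update for a zero-sum matrix game (row player minimizes x^T A y, column player maximizes), and the payoff matrix, step size `eta`, and horizon `T` are illustrative choices that do not appear in the source. It does not reproduce the paper's hard instances or its rates; it only shows how the best-iterate, random-iterate, and last-iterate duality gaps are computed from the same trajectory.

```python
import numpy as np


def duality_gap(A, x, y):
    """Duality gap of (x, y) when x minimizes and y maximizes x^T A y."""
    return float(np.max(A.T @ x) - np.min(A @ y))


def omwu_gaps(A, eta=0.1, T=2000):
    """Run OMWU from uniform strategies and return the duality gap of each iterate.

    Assumed update (one common formulation, not taken from the source):
        x_{t+1} is proportional to x_t * exp(-eta * (2 * A y_t - A y_{t-1}))
        y_{t+1} is proportional to y_t * exp(+eta * (2 * A^T x_t - A^T x_{t-1}))
    """
    n, m = A.shape
    x = np.full(n, 1.0 / n)
    y = np.full(m, 1.0 / m)
    gx_prev, gy_prev = A @ y, A.T @ x  # previous-round gradients
    gaps = []
    for _ in range(T):
        gaps.append(duality_gap(A, x, y))
        gx, gy = A @ y, A.T @ x  # current gradients, computed simultaneously
        x_new = x * np.exp(-eta * (2 * gx - gx_prev))
        y_new = y * np.exp(eta * (2 * gy - gy_prev))
        x, y = x_new / x_new.sum(), y_new / y_new.sum()
        gx_prev, gy_prev = gx, gy
    return np.array(gaps)


# Illustrative 2x2 game (hypothetical choice) with an interior equilibrium.
A = np.array([[2.0, -1.0], [-1.0, 1.0]])
gaps = omwu_gaps(A)

best_iterate = gaps.min()     # smallest gap among all iterates
random_iterate = gaps.mean()  # expected gap of a uniformly random iterate
last_iterate = gaps[-1]       # gap of the final iterate
print(best_iterate, random_iterate, last_iterate)
```

By construction the best-iterate gap is never larger than the random-iterate gap or the last-iterate gap. The abstract's separation result concerns how fast each quantity shrinks as T grows, which a single run on one game cannot establish.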

arXiv information

Authors	Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
Published	2025-03-04 17:49:24+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
