"cs.GT" Category Archive

On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games

Posted: March 5, 2025 | Author: jarxiv

Abstract: Non-ergodic convergence of learning dynamics in games is, in both theory and practice, … Continue reading →

Categories: cs.GT, cs.LG, math.OC | Comments are closed

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

Posted: March 5, 2025 | Author: jarxiv

Abstract: Based on Regret Matching$^+$ (RM$^+$), solving two-player zero-sum games … Continue reading →

Categories: cs.GT, cs.LG | Comments are closed

Anytime-Constrained Equilibria in Polynomial Time

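Posted: March 5, 2025 | Author: jarxiv

Abstract: Anytime constraints in the Markov game setting, and the corresponding anytime-constrained equilibrium (ACE) … Continue reading →

Categories: cs.AI, cs.DS, cs.GT, cs.LG | Comments are closed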

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Posted: March 4, 2025 | Author: jarxiv

Abstract: Reinforcement learning from human feedback (RLHF), for large language models (LL … Continue reading →

Categories: cs.AI, cs.CL, cs.GT, cs.LG | Comments are closed

Re-evaluating Open-ended Evaluation of Large Language Models

Posted: February 28, 2025 | Author: jarxiv

Abstract: Evaluation has traditionally focused on ranking candidates for a specific skill. … Continue reading →

Categories: cs.CL, cs.GT, cs.LG, stat.ML | Comments are closed

Mixing Any Cocktail with Limited Ingredients: On the Structure of Payoff Sets in Multi-Objective MDPs and its Impact on Randomised Strategies

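Posted: February 26, 2025 | Author: jarxiv

Abstract: We consider multi-dimensional payoff functions in Markov decision processes, and a given expected payoff … Continue reading →

Categories: cs.AI, cs.FL, cs.GT, cs.LO, math.PR | Comments are closed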

Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness

Posted: February 25, 2025 | Author: jarxiv

Abstract: Adversarial training aims to defend against an *adversary*, whose sole … Continue reading →

Categories: cs.GT, cs.LG | Comments are closed

An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

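Posted: February 24, 2025 | Author: jarxiv

Abstract: When the learner's prior is defined over the space of the adversary's future actions rather than the space of experts … Continue reading →

Categories: cs.GT, cs.LG, math.ST, stat.ML, stat.TH | Comments are closed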

An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

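Posted: February 21, 2025 | Author: jarxiv

Abstract: When the learner's prior is defined over the space of the adversary's future actions rather than the space of experts … Continue reading →

Categories: cs.GT, cs.LG, math.ST, stat.ML, stat.TH | Comments are closed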

Human Misperception of Generative-AI Alignment: A Laboratory Experiment

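Posted: February 21, 2025 | Author: jarxiv

Abstract: In the context of economic decision-making, generative artificial intelligence (GenAI) alignment … Continue reading →

Categories: cs.AI, cs.GT, econ.TH | Comments are closed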
