‘Pick-and-Pass’ as a Hat-Trick Class for First-Principle Memory, Generalizability, and Interpretability Benchmarks

Summary

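Closed drafting, or "pick and pass", is a popular game mechanic in which, each round, every player selects a card or other playable element from their hand and passes the rest to the next player.
Games that use closed drafting are well suited to studying memory and turn order, because the contents of other players' hands can be calculated explicitly from what a player has already seen.
This paper establishes first-principle benchmarks for studying model-free reinforcement learning algorithms, and compares their ability to learn memory, in a popular family of closed drafting games called "Sushi Go Party!", producing state-of-the-art results on this environment along the way.
Furthermore, because Sushi Go Party! can be expressed as a set of closely related games determined by the set of cards in play, the authors quantify the generalizability of reinforcement learning algorithms trained on various card sets, establishing key trends between generalized performance and the set distance between the training and evaluation game configurations.
Finally, they fit decision rules to interpret the strategies of the learned models and compare them with the ranking preferences of human players, finding intuitive common rules and intriguing new moves.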
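
The pick-and-pass mechanic described above can be illustrated with a minimal Python sketch. This is not the authors' environment: the function names `play_round` and `hand_origin`, the card labels, and the "always keep the first card" policy are all hypothetical, chosen only to show why a player's memory of other hands is explicitly calculable.

```python
def play_round(hands, choose):
    """Simulate one closed-drafting round (hypothetical sketch).

    hands:  list of lists of cards, one list per player.
    choose: function (player_index, hand) -> card that player keeps.
    Returns a list with the cards each player picked, in pick order.
    """
    n = len(hands)
    hands = [list(h) for h in hands]  # copy so the caller's lists are untouched
    picks = [[] for _ in range(n)]
    while hands[0]:
        # Each player keeps one card. Choices depend only on the player's own
        # hand, so looping over players is equivalent to simultaneous picks.
        for p in range(n):
            card = choose(p, hands[p])
            hands[p].remove(card)
            picks[p].append(card)
        # Pass: player p receives what player p - 1 was holding.
        hands = [hands[(p - 1) % n] for p in range(n)]
    return picks


def hand_origin(player, passes, num_players):
    """Index of the player who was originally dealt the hand that
    `player` holds after `passes` passes."""
    return (player - passes) % num_players


dealt = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]
picks = play_round(dealt, lambda p, hand: hand[0])
print(picks)
```

Because the passing order is fixed, a hand returns to a player after as many passes as there are players, minus the cards picked in between. An agent that remembers the hands it has seen can therefore bound what each opponent currently holds, which is the kind of memory the benchmarks are meant to probe.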

Summary (Original)

Closed drafting or ‘pick and pass’ is a popular game mechanic where each round players select a card or other playable element from their hand and pass the rest to the next player. Games employing closed drafting make for great studies on memory and turn order due to their explicitly calculable memory of other players’ hands. In this paper, we establish first-principle benchmarks for studying model-free reinforcement learning algorithms and their comparative ability to learn memory in a popular family of closed drafting games called ‘Sushi Go Party!’, producing state-of-the-art results on this environment along the way. Furthermore, as Sushi Go Party! can be expressed as a set of closely-related games based on the set of cards in play, we quantify the generalizability of reinforcement learning algorithms trained on various sets of cards, establishing key trends between generalized performance and the set distance between the train and evaluation game configurations. Finally, we fit decision rules to interpret the strategy of the learned models and compare them to the ranking preferences of human players, finding intuitive common rules and intriguing new moves.
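
The abstract refers to a "set distance" between training and evaluation game configurations without naming the metric. As one possible illustration, the sketch below uses the Jaccard distance between two card sets. The choice of Jaccard distance, the function name `jaccard_distance`, and the two example configurations are assumptions made here, not details taken from the paper.

```python
def jaccard_distance(train_cards, eval_cards):
    """Jaccard distance between two card sets (assumed metric).

    0.0 means identical configurations; 1.0 means no cards in common.
    """
    a, b = set(train_cards), set(eval_cards)
    union = a | b
    if not union:
        return 0.0  # two empty configurations are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical configurations that differ in a single card type.
train_config = {"nigiri", "maki", "tempura", "pudding"}
eval_config = {"nigiri", "maki", "sashimi", "pudding"}

distance = jaccard_distance(train_config, eval_config)
print(round(distance, 3))
```

Under this assumed metric, swapping more card types between the training and evaluation configurations increases the distance, which gives a single axis against which generalized performance can be plotted.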

arXiv Information

Authors	Jason Wang, Ryan Rezai
Published	2023-10-31 17:24:40+00:00
arXiv site	arxiv_id(pdf)

Source and Services Used

arxiv.jp, Google
