Archive of posts by "jarxiv"

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Posted on May 29, 2025 by jarxiv

Abstract: Reasoning-capable large language models (LLMs) show strong performance on complex reasoning tasks …

Categories: cs.CL, cs.LG

How Do LLMs Perform Two-Hop Reasoning in Context?

Posted on May 29, 2025 by jarxiv

Abstract: "Socrates is human. All humans are mortal. Therefore, Socrates …

Categories: cs.AI, cs.CL

Human-Centered Human-AI Collaboration (HCHAC)

Posted on May 29, 2025 by jarxiv

Abstract: In the intelligent era, the interaction between humans and intelligent systems …

Categories: cs.AI, cs.CY, cs.HC

Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Posted on May 29, 2025 by jarxiv

Abstract: Including valid error bars and significance tests, rigorous large language model (LLM) …

Categories: cs.AI, cs.LG, stat.ML

Learned Collusion

Posted on May 29, 2025 by jarxiv

Abstract: In Q-learning, estimates of the continuation value associated with each available action (Q-values) …

Categories: cs.AI, cs.GT, econ.TH

On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

Posted on May 29, 2025 by jarxiv

Abstract: The dominant paradigm for training large vision and language models …

Categories: cs.AI, cs.LG, stat.ML

Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

Posted on May 29, 2025 by jarxiv

Abstract: This paper focuses on estimating the behavior policy for importance sampling …

Categories: cs.AI, cs.LG, stat.ML

Novelty Detection in Reinforcement Learning with World Models

Posted on May 29, 2025 by jarxiv

Abstract: Reinforcement learning (RL) using world models has recently found significant success. …

Categories: cs.AI, cs.LG, cs.SY, eess.SY

From Strangers to Assistants: Fast Desire Alignment for Embodied Agent-User Adaptation

Posted on May 29, 2025 by jarxiv

Abstract: Embodied agents have made significant progress in performing complex physical tasks …

Categories: cs.AI, cs.MA, cs.RO

Overcoming the Machine Penalty with Imperfectly Fair AI Agents

Posted on May 29, 2025 by jarxiv

Abstract: Despite rapid technological progress, effective human-machine cooperation remains a significant …

Categories: cs.AI, cs.GT, cs.HC, econ.GN, q-fin.EC
