Monthly Archives: April 2025

Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation

Posted on April 23, 2025 by jarxiv

Abstract: With recent advances in vision-language models (VLMs), complex graphical user … Continue reading →

Categories: cs.CL | Comments Off

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Posted on April 23, 2025 by jarxiv

Abstract: We introduce PHYBench. In physical contexts, PHYBench … Continue reading →

Categories: cs.CL | Comments Off

Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

Posted on April 23, 2025 by jarxiv

Abstract: For large language models (LLMs), language agents tackling simple tasks … Continue reading →

Categories: cs.CL | Comments Off

TTRL: Test-Time Reinforcement Learning

Posted on April 23, 2025 by jarxiv

Abstract: In this paper, for reasoning tasks in large language models (LLMs) … Continue reading →

Categories: cs.CL, cs.LG | Comments Off

Facilitating Reinforcement Learning for Process Control Using Transfer Learning: Overview and Perspectives

Posted on April 23, 2025 by jarxiv

Abstract: In the context of Industry 4.0 and Smart Manufacturing … Continue reading →

Categories: cs.AI, cs.LG, cs.SY, eess.SY | Comments Off

Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation

Posted on April 23, 2025 by jarxiv

Abstract: In swarm robotics, confrontation scenarios, including strategic confrontation, … Continue reading →

Categories: cs.AI, cs.RO | Comments Off

Time’s Up! An Empirical Study of LLM Reasoning Ability Under Output Length Constraint

Posted on April 23, 2025 by jarxiv

Abstract: According to recent work, large language models (LLMs) in test-time scaling … Continue reading →

Categories: cs.AI | Comments Off

Supporting Data-Frame Dynamics in AI-assisted Decision Making

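Posted on April 23, 2025 by jarxiv

Abstract: High-stakes decision making involves a continual interplay between evolving evidence and shifting hypotheses … Continue reading →

Categories: cs.AI, cs.HC | Comments Off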

Dynamic Early Exit in Reasoning Models

Posted on April 23, 2025 by jarxiv

Abstract: Recent advances in large reasoning language models (LRLMs), test-time scaling … Continue reading →

Categories: cs.AI, cs.CL | Comments Off

Impact of Noise on LLM-Models Performance in Abstraction and Reasoning Corpus (ARC) Tasks with Model Temperature Considerations

Posted on April 23, 2025 by jarxiv

Abstract: With recent advances in large language models (LLMs), particularly abstraction and pattern recognition … Continue reading →

Categories: cs.AI | Comments Off
