Archive of posts by "jarxiv"

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

Posted: May 29, 2025 | Author: jarxiv

Summary: Reinforcement learning fine-tuning (RFT), long-form reasoning, self-correction, and effective … Continue reading →

Categories: cs.AI, cs.LG | Comments are closed

MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems

Posted: May 29, 2025 | Author: jarxiv

Summary: Large language models (LLMs), for embodied agents' zero-shot pla … Continue reading →

Categories: cs.AI | Comments are closed

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Posted: May 29, 2025 | Author: jarxiv

Summary: Large language models have shown strong performance across various domains … Continue reading →

Categories: cs.AI, cs.CL, cs.LG | Comments are closed

HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

Posted: May 29, 2025 | Author: jarxiv

Summary: In recent years, using tools such as OpenAI Gym, Reinforcement L … Continue reading →

Categories: cs.AI, cs.LG, cs.MA | Comments are closed

On the performance of machine-learning assisted Monte Carlo in sampling from simple statistical physics models

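Posted: May 29, 2025 | Author: jarxiv

Summary: In recent years, for hard-to-sample systems that cannot be studied with conventional methods, simula … Continue reading →

Categories: cond-mat.dis-nn, cond-mat.stat-mech, cs.AI, cs.LG, physics.comp-ph | Comments are closed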

Machine Unlearning under Overparameterization

Posted: May 29, 2025 | Author: jarxiv

Summary: Machine unlearning algorithms, removing the influence of specific training samples … Continue reading →

Categories: cs.AI, cs.LG | Comments are closed

Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Posted: May 29, 2025 | Author: jarxiv

Summary: To learn diffusion processes that sample from unnormalized densities or energy functions … Continue reading →

Categories: cs.AI, cs.LG | Comments are closed

One Rank at a Time: Cascading Error Dynamics in Sequential Learning

Posted: May 29, 2025 | Author: jarxiv

Summary: Sequential learning, in which complex tasks are decomposed into simpler, hierarchical components, … Continue reading →

Categories: cs.AI, cs.LG, math.OC | Comments are closed

Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates

Posted: May 29, 2025 | Author: jarxiv

Summary: In this paper, model pruning and parameter updates, tightly in a single stage, … Continue reading →

Categories: cs.AI, cs.SD, eess.AS | Comments are closed

Robust Localization, Mapping, and Navigation for Quadruped Robots

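Posted: May 29, 2025 | Author: jarxiv

Summary: Quadruped robots currently, with powerful reinforcement learning controllers and inexpensive, robust commercial plat … Continue reading →

Categories: cs.AI, cs.RO | Comments are closed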
