Category archive: cs.LG

The Best Instruction-Tuning Data are Those That Fit

Posted on February 10, 2025 by jarxiv

Summary: High-quality supervised fine-tuning (SFT) data, pretrained large language models (L …

Categories: cs.AI, cs.CL, cs.LG

TD-M(PC)$^2$: Improving Temporal Difference MPC Through Policy Constraint

Posted on February 7, 2025 by jarxiv

Summary: Model-based reinforcement learning algorithms that combine model-based planning and prior …

Categories: cs.LG, cs.RO

Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models

Posted on February 7, 2025 by jarxiv

Summary: Recent advances in diffusion models hold great potential for robotics, and the environment's …

Categories: cs.AI, cs.LG, cs.RO

Control-oriented Clustering of Visual Latent Representation

Posted on February 7, 2025 by jarxiv

Summary: The geometry of the visual representation space (from the vision encoder to the action decoder …

Categories: cs.CV, cs.LG, cs.RO

Fast Ergodic Search with Kernel Functions

Posted on February 7, 2025 by jarxiv

Summary: With ergodic search, while guaranteeing asymptotic coverage of the search space …

Categories: cs.LG, cs.RO

How vulnerable is my policy? Adversarial attacks on modern behavior cloning policies

Posted on February 7, 2025 by jarxiv

Summary: Learning from Demonstration (LfD) algorithms, in robot manipulation tasks …

Categories: cs.CR, cs.LG, cs.RO

MimicTouch: Leveraging Multi-modal Human Tactile Demonstrations for Contact-rich Manipulation

Posted on February 7, 2025 by jarxiv

Summary: For fine-grained, contact-rich manipulation tasks such as insertion and assembly, tactile sensing is …

Categories: cs.LG, cs.RO

RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms

Posted on February 7, 2025 by jarxiv

Summary: For autonomous racing, robust control capable of making rapid decisions under dynamic conditions …

Categories: 68T40, cs.LG, cs.RO

M$^3$PC: Test-time Model Predictive Control for Pretrained Masked Trajectory Model

Posted on February 7, 2025 by jarxiv

Summary: Recent work in offline reinforcement learning (RL), masked autoencoding …

Categories: cs.LG, cs.RO, cs.SY, eess.SY

Scenario-Based Curriculum Generation for Multi-Agent Autonomous Driving

Posted on February 7, 2025 by jarxiv

Summary: In many complex learning tasks, the automatic generation of diverse and complex training scenarios …

Categories: cs.LG, cs.MA, cs.RO
