"cs.LG" Category Archive

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Posted: April 4, 2025 | Author: jarxiv

Abstract: Mitigating hallucinations in large vision-language models (LVLMs) remains an open …

Categories: cs.AI, cs.CV, cs.LG

RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics

Posted: April 4, 2025 | Author: jarxiv

Abstract: Through multimodal perception and semantic reasoning, vision-language models (VLMs) …

Categories: cs.AI, cs.LG, cs.RO

Adapting World Models with Latent-State Dynamics Residuals

Posted: April 4, 2025 | Author: jarxiv

Abstract: In simulation-to-reality reinforcement learning (RL), simulation and the real world …

Categories: cs.AI, cs.LG, cs.RO

R+X: Retrieval and Execution from Everyday Human Videos

Posted: April 4, 2025 | Author: jarxiv

Abstract: We … robots performing everyday tasks, unlabeled, …

Categories: cs.LG, cs.RO

CHARMS: Cognitive Hierarchical Agent with Reasoning and Motion Styles

Posted: April 4, 2025 | Author: jarxiv

Abstract: Low intelligence and simplified vehicle behavior modeling in autonomous driving simulation scenarios …

Categories: cs.AI, cs.LG, cs.RO

Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets

Posted: April 4, 2025 | Author: jarxiv

Abstract: Imitation learning has emerged as a promising approach for building general-purpose robots …

Categories: cs.AI, cs.LG, cs.RO

Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries

Posted: April 4, 2025 | Author: jarxiv

Abstract: The Internet of Vehicles (IoV) may require advanced intrusion detection systems …

Categories: cs.AI, cs.CR, cs.LG

Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning

Posted: April 4, 2025 | Author: jarxiv

Abstract: We … realistic data, in particular non-expert data collected by suboptimal behavior policies …

Categories: cs.LG, cs.RO

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Posted: April 3, 2025 | Author: jarxiv

Abstract: Multiple spatial controls across various modalities such as segmentation, depth, and edges …

Categories: cs.AI, cs.CV, cs.LG, cs.RO

Emotion estimation from video footage with LSTM

Posted: April 3, 2025 | Author: jarxiv

Abstract: Emotion estimation in general is a field that has been studied for a long time, and using machine learning …

Categories: (Primary), 68T40, cs.CV, cs.LG, cs.RO, I.2.9
