"cs.CV" Category Archive

Equivariant Image Modeling

Posted: March 25, 2025 | Author: jarxiv

Abstract: For current generative models, such as autoregressive and diffusion approaches, high-dimensional data distribution learning …

Categories: cs.CV

Target-Aware Video Diffusion Models

Posted: March 25, 2025 | Author: jarxiv

Abstract: We present a target-aware video diffusion model. In it, an actor … the desired …

Categories: cs.CV

Depth Matters: Multimodal RGB-D Perception for Robust Autonomous Agents

Posted: March 24, 2025 | Author: jarxiv

Abstract: Autonomous agents that rely purely on perception to make real-time control decisions …

Categories: cs.CV, cs.LG, cs.RO

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Posted: March 24, 2025 | Author: jarxiv

Abstract: Integrating language and 3D perception, for embodied … that understand and interact with the physical world …

Categories: cs.AI, cs.CL, cs.CV, cs.LG, cs.RO

SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion

Posted: March 24, 2025 | Author: jarxiv

Abstract: Recently, camera-based solutions for semantic scene completion (SSC) …

Categories: cs.CV, cs.RO

An Integrated Approach to Robotic Object Grasping and Manipulation

Posted: March 24, 2025 | Author: jarxiv

Abstract: In response to the growing challenges of manual labor and efficiency in warehouse operations, Amazon …

Categories: cs.CV, cs.RO

HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views

Posted: March 24, 2025 | Author: jarxiv

Abstract: In both ground-to-ground and ground-to-aerial scenarios across urban and forest environments, large-scale …

Categories: cs.CV, cs.RO

Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation

Posted: March 24, 2025 | Author: jarxiv

Abstract: Improving generalization is one of the key challenges in embodied AI. Here, diverse …

Categories: cs.CV, cs.RO

HAPI: A Model for Learning Robot Facial Expressions from Human Preferences

Posted: March 24, 2025 | Author: jarxiv

Abstract: Hand-crafted methods based on fixed joint configurations lead to rigid, unnatural behavior …

Categories: cs.AI, cs.CV, cs.HC, cs.LG, cs.RO

GAA-TSO: Geometry-Aware Assisted Depth Completion for Transparent and Specular Objects

Posted: March 24, 2025 | Author: jarxiv

Abstract: Transparent and specular objects are frequently encountered in daily life, factories, and laboratories. …

Categories: cs.CV, cs.RO
