"cs.CV" Category Archive

Keypoint Detection and Description for Raw Bayer Images

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Keypoint detection and local feature description are fundamental tasks in robotic perception, and … Continue reading →

Category: cs.CV | Comments off

Language-Depth Navigated Thermal and Visible Image Fusion

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Depth-guided multimodal fusion combines depth information from visible and infrared images … Continue reading →

Category: cs.CV | Comments off

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Diffusion-based generative models have revolutionized object-oriented image editing … Continue reading →

Category: cs.CV | Comments off

GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing

Posted on: March 12, 2025 | Author: jarxiv

Abstract: We introduce GarmentCrafter, with which non-professional users … Continue reading →

Categories: cs.AI, cs.CV, cs.GR | Comments off

CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Vehicle-to-vehicle (V2V) cooperative autonomous driving … inherent to single-agent systems … Continue reading →

Categories: cs.AI, cs.CV, cs.MA | Comments off

‘Principal Components’ Enable A New Language of Images

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Embedding a provable PCA-like structure into the latent token space, a novel visual tokenization … Continue reading →

Category: cs.CV | Comments off

OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Unified multimodal understanding and visual generation (or multimodal generation) models … Continue reading →

Category: cs.CV | Comments off

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension

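Posted on: March 12, 2025 | Author: jarxiv

Abstract: Recent advances in long video understanding typically … visual token pruning based on attention distribution … Continue reading →

Category: cs.CV | Comments off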

AthletePose3D: A Benchmark Dataset for 3D Human Pose Estimation and Kinematic Validation in Athletic Movements

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Human pose estimation … sports science, rehabilitation, and biomechanical research … Continue reading →

Category: cs.CV | Comments off

DaD: Distilled Reinforcement Learning for Diverse Keypoint Detection

Posted on: March 12, 2025 | Author: jarxiv

Abstract: Keypoints … Structure-from-Motion (SfM) systems scaling to thousands of images … Continue reading →

Category: cs.CV | Comments off
