"cs.CV" Category Archive

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Posted on: December 31, 2024 | Author: jarxiv

Abstract: Building an animatable, personalized generative avatar from a reference portrait …

Categories: cs.CV

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning

Posted on: December 31, 2024 | Author: jarxiv

Abstract: Recent lightweight image captioning models that use retrieved data, mainly text …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models

Posted on: December 31, 2024 | Author: jarxiv

Abstract: Zero-shot customized video generation, because of its great application potential, …

Categories: cs.CV

MVTamperBench: Evaluating Robustness of Vision-Language Models

Posted on: December 31, 2024 | Author: jarxiv

Abstract: With recent advances in vision-language models (VLMs), complex video understanding tasks …

Categories: 68Q32, 68Q85, 68T05, 68T37, 68T40, 68T45, 94A08, cs.CV, I.2.10

Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation

Posted on: December 30, 2024 | Author: jarxiv

Abstract: By using episodic simulation and episodic memory, humans … unfamiliar …

Categories: cs.CV, cs.LG, cs.RO

A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space

Posted on: December 30, 2024 | Author: jarxiv

Abstract: Open-set object detection (OSOD), for robots in unstructured environments, …

Categories: cs.AI, cs.CV, cs.RO

EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation

Posted on: December 30, 2024 | Author: jarxiv

Abstract: Object manipulation is a common component of everyday tasks, but from high-dimensional observations, object …

Categories: cs.AI, cs.CV, cs.RO

PhotoBot: Reference-Guided Interactive Photography via Natural Language

Posted on: December 30, 2024 | Author: jarxiv

Abstract: Based on the interaction between high-level human language guidance and a robot photographer, a fully …

Categories: cs.AI, cs.CV, cs.RO

Learning Monocular Depth from Events via Egomotion Compensation

Posted on: December 30, 2024 | Author: jarxiv

Abstract: Event cameras report brightness changes sparsely and asynchronously, neuromorphically inspired …

Categories: cs.CV, cs.LG, cs.RO

CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition

Posted on: December 30, 2024 | Author: jarxiv

Abstract: Extending the previous work PoCo, global retrieval and reranking …

Categories: cs.CV, cs.RO
