Monthly archive: August 2024

BoostTrack++: using tracklet information to detect more objects in multiple object tracking

Posted: August 26, 2024 | Author: jarxiv

Summary: For multiple object tracking (MOT), the selection of true-positive detected bounding boxes …

Categories: cs.AI, cs.CV

EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation

Posted: August 26, 2024 | Author: jarxiv

Summary: Text-guided image generation, as represented by Stable Diffusion, …

Categories: cs.CV

Collaborative Control for Geometry-Conditioned PBR Image Generation

Posted: August 26, 2024 | Author: jarxiv

Summary: For graphics pipelines, physically based rendering (PBR) …

Categories: cs.CV, cs.GR, I.4.0

MAML MOT: Multiple Object Tracking based on Meta-Learning

Posted: August 26, 2024 | Author: jarxiv

Summary: With advances in video analysis technology, in complex scenes involving pedestrians, multi- …

Categories: cs.AI, cs.CV

TokenPacker: Efficient Visual Projector for Multimodal LLM

Posted: August 26, 2024 | Author: jarxiv

Summary: In multimodal LLMs (MLLMs), the visual projector …

Categories: cs.CV

Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding

Posted: August 26, 2024 | Author: jarxiv

Summary: 3D object affordance grounding, on 3D objects, …

Categories: cs.CV

Indoor scene recognition from images under visual corruptions

Posted: August 26, 2024 | Author: jarxiv

Summary: Indoor scene classification, such as in intelligent robotics for assisted living, …

Categories: cs.CV

PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Posted: August 26, 2024 | Author: jarxiv

Summary: Robotic manipulation with two-finger grippers, lacking clear graspable features …

Categories: cs.CV, cs.RO

VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models

Posted: August 26, 2024 | Author: jarxiv

Summary: Existing vehicle detectors typically, with a pre-trained backbone (ResNe …

Categories: cs.AI, cs.CV, cs.NE

S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points

Posted: August 26, 2024 | Author: jarxiv

Summary: Recently, interest in dynamic scene reconstruction using Gaussians has been growing …

Categories: cs.CV
