"cs.CV" Category Archive

SurGen: Text-Guided Diffusion Model for Surgical Video Generation

Posted on August 27, 2024 by jarxiv

Abstract: Diffusion-based video generation models have made significant progress, with visual fidelity, temporal consistency …

Categories: cs.AI, cs.CL, cs.CV, cs.LG

Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Posted on August 27, 2024 by jarxiv

Abstract: Large multimodal models (LMMs) have shown promise in vision-language tasks …

Categories: cs.AI, cs.CL, cs.CV

Explaining Vision-Language Similarities in Dual Encoders with Feature-Pair Attributions

Posted on August 27, 2024 by jarxiv

Abstract: Dual-encoder architectures such as CLIP models … two types of …

Categories: cs.AI, cs.CL, cs.CV

MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

Posted on August 27, 2024 by jarxiv

Abstract: Existing work on human reconstruction from a single image, where training data is insufficient …

Categories: cs.AI, cs.CV

TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation

Posted on August 27, 2024 by jarxiv

Abstract: Infrared imaging captures the temperature of objects and, against changing lighting conditions, …

Categories: cs.CV

Gallery-Aware Uncertainty Estimation For Open-Set Face Recognition

Posted on August 27, 2024 by jarxiv

Abstract: Accurately estimating image quality and improving model robustness are key challenges in unconstrained face recognition …

Categories: cs.AI, cs.CV, cs.LG

LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras

Posted on August 27, 2024 by jarxiv

Abstract: Leveraging rich information is crucial for dense prediction tasks. …

Categories: cs.CV, cs.RO, eess.IV

Planner3D: LLM-enhanced graph prior meets 3D indoor scene explicit regularization

Posted on August 27, 2024 by jarxiv

Abstract: In compositional 3D scene synthesis, the complexity of real-world multi-object environments is closely …

Categories: cs.CV

Cascaded Temporal Updating Network for Efficient Video Super-Resolution

Posted on August 27, 2024 by jarxiv

Abstract: Existing video super-resolution (VSR) methods generally adopt a recurrent propagation network …

Categories: cs.CV

Interpretable Representation Learning of Cardiac MRI via Attribute Regularization

Posted on August 27, 2024 by jarxiv

Abstract: For clinicians to understand and trust artificial intelligence models, in medical imaging …

Categories: cs.CV, eess.IV
