Monthly Archive: April 2025

SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model

Posted on: April 15, 2025 | Author: jarxiv

Abstract: Video anomaly detection (VAD) aims to identify unexpected events in videos … Continue reading →

Category: cs.CV | Comments closed

InstructEngine: Instruction-driven Text-to-Image Alignment

Posted on: April 15, 2025 | Author: jarxiv

Abstract: Reinforcement learning from human/AI feedback (RLHF/RLAIF) … Continue reading →

Category: cs.CV | Comments closed

HOMER: Homography-Based Efficient Multi-view 3D Object Removal

Posted on: April 15, 2025 | Author: jarxiv

Abstract: 3D object removal is an important subtask of 3D scene editing … Continue reading →

Category: cs.CV | Comments closed

LL-Gaussian: Low-Light Scene Reconstruction and Enhancement via Gaussian Splatting for Novel View Synthesis

Posted on: April 15, 2025 | Author: jarxiv

Abstract: Novel view synthesis (NVS) in low-light scenes faces severe noise, low dynamic … Continue reading →

Category: cs.CV | Comments closed

PSGait: Gait Recognition using Parsing Skeleton

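Posted on: April 15, 2025 | Author: jarxiv

Abstract: Gait recognition, thanks to its non-intrusive nature and resilience to occlusion, is a robust biometric modality … Continue reading →

Category: cs.CV | Comments closed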

Benchmarking 3D Human Pose Estimation Models Under Occlusions

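Posted on: April 15, 2025 | Author: jarxiv

Abstract: This paper concerns how existing models respond to occlusions, camera position, and action variability … Continue reading →

Category: cs.CV | Comments closed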

Multimodal Representation Learning Techniques for Comprehensive Facial State Analysis

Posted on: April 15, 2025 | Author: jarxiv

Abstract: Multimodal foundation models integrate information from multiple modalities … Continue reading →

Category: cs.CV | Comments closed

GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion

Posted on: April 15, 2025 | Author: jarxiv

Abstract: From monocular videos captured by commodity devices such as smartphones … Continue reading →

Category: cs.AI, cs.CV, cs.GR | Comments closed

ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements

Posted on: April 15, 2025 | Author: jarxiv

Abstract: Recent advances in foundational vision-language models (VLMs) have, in computer vision … Continue reading →

Category: cs.CV | Comments closed

Patch and Shuffle: A Preprocessing Technique for Texture Classification in Autonomous Cementitious Fabrication

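Posted on: April 15, 2025 | Author: jarxiv

Abstract: Autonomous fabrication systems are transforming construction and manufacturing, but remain vulnerable to printing errors … Continue reading →

Category: cs.CV | Comments closed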
