Category archive: cs.CV

StarVid: Enhancing Semantic Alignment in Video Diffusion Models via Spatial and SynTactic Guided Attention Refocusing

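Posted: March 4, 2025 | Author: jarxiv

Summary: Recent advances in text-to-video (T2V) generation using diffusion models have attracted considerable … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed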

Self-Supervised Iterative Refinement for Anomaly Detection in Industrial Quality Control

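Posted: March 4, 2025 | Author: jarxiv

Summary: In this study, we introduce the Iterative Refinement Process (IRP), a robust anomaly detection method, … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed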

FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation

Posted: March 4, 2025 | Author: jarxiv

Summary: Research on understanding food images using recipe data, given the diversity of that data and … Continue reading →

Categories: cs.CV, cs.MM | Comments are closed

ModeDreamer: Mode Guiding Score Distillation for Text-to-3D Generation using Reference Image Prompts

Posted: March 4, 2025 | Author: jarxiv

Summary: Existing methods based on Score Distillation Sampling (SDS) … Continue reading →

Categories: cs.CV | Comments are closed

The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition

Posted: March 4, 2025 | Author: jarxiv

Summary: Because captured behaviours are the earliest indicators of changes in population health, camera … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed

Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models

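Posted: March 4, 2025 | Author: jarxiv

Summary: Object-centric (OC) representations, which model visual scenes as compositions of discrete objects, … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed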

Foundation Models — A Panacea for Artificial Intelligence in Pathology?

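Posted: March 4, 2025 | Author: jarxiv

Summary: The role of artificial intelligence (AI) in pathological diagnosis, from diagnostic assistance to whole-slide images ( … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed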

MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

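Posted: March 4, 2025 | Author: jarxiv

Summary: Diffusion-based image generation has advanced considerably, but subject-driven generation and instruction-based editing … Continue reading →

Categories: cs.CV | Comments are closed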

FlexDrive: Toward Trajectory Flexibility in Driving Scene Reconstruction and Rendering

Posted: March 4, 2025 | Author: jarxiv

Summary: Driving scene reconstruction and rendering, using 3D Gaussian Splatting … Continue reading →

Categories: cs.CV | Comments are closed

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning

Posted: March 4, 2025 | Author: jarxiv

Summary: Multi-Instance Learning (MI … Continue reading →

Categories: cs.CV | Comments are closed
