"cs.CV" Category Archive

3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results

Posted: January 20, 2025 | Author: jarxiv

Abstract: For Maritime Computer Vision (MaCVi) 2025, the 3rd …

Categories: cs.AI, cs.CV

Zero-Shot Monocular Scene Flow Estimation in the Wild

Posted: January 20, 2025 | Author: jarxiv

Abstract: Large models, for many low-level vision tasks such as depth estimation, dataset …

Categories: cs.CV

FaceXBench: Evaluating Multimodal LLMs on Face Understanding

Posted: January 20, 2025 | Author: jarxiv

Abstract: Multimodal large language models (MLLMs), across a wide range of tasks and domains, …

Categories: cs.CV

Mesh2SLAM in VR: A Fast Geometry-Based SLAM Framework for Rapid Prototyping in Virtual Reality Applications

Posted: January 20, 2025 | Author: jarxiv

Abstract: SLAM is a foundational technology with broad applications in robotics and AR/VR …

Categories: cs.CV, cs.RO

Instruction-Guided Fusion of Multi-Layer Visual Features in Large Vision-Language Models

Posted: January 20, 2025 | Author: jarxiv

Abstract: Large vision-language models (LVLMs), pre-trained vision …

Categories: cs.CV, cs.LG

MECD+: Unlocking Event-Level Causal Graph Discovery for Video Reasoning

Posted: January 20, 2025 | Author: jarxiv

Abstract: Video causal reasoning aims at a high-level understanding of videos from a causal perspective …

Categories: cs.CV

Embodied Scene Understanding for Vision Language Models via MetaVQA

Posted: January 17, 2025 | Author: jarxiv

Abstract: Vision language models (VLMs), various mobility applications …

Categories: cs.CV, cs.RO

Unified Few-shot Crack Segmentation and its Precise 3D Automatic Measurement in Concrete Structures

Posted: January 17, 2025 | Author: jarxiv

Abstract: Visual-spatial systems are becoming increasingly indispensable in concrete crack inspection …

Categories: cs.CV, cs.RO

Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites

Posted: January 17, 2025 | Author: jarxiv

Abstract: The construction industry has long explored robotics and computer vision, but …

Categories: cs.CV, cs.RO

Efficient Few-Shot Medical Image Analysis via Hierarchical Contrastive Vision-Language Learning

Posted: January 17, 2025 | Author: jarxiv

Abstract: Few-shot learning in medical image classification is subject to limited availability of annotated data and …

Categories: cs.CL, cs.CV
