"cs.CV" Category Archive

Flex3D: Feed-Forward 3D Generation with Flexible Reconstruction Model and Input View Curation

Posted: June 3, 2025 | Author: jarxiv

Abstract: High-quality 3D content from text, a single image, or sparse-view images … Continue reading →

Categories: cs.CV, cs.GR, eess.IV

MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM

Posted: June 3, 2025 | Author: jarxiv

Abstract: Multimodal hallucination in multimodal large language models (MLLMs), MLLM … Continue reading →

Categories: cs.CV, cs.LG

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Posted: June 3, 2025 | Author: jarxiv

Abstract: For medical vision-language models, generating accurate quantitative measurements in radiology reports often … Continue reading →

Categories: cs.CV

Urban Safety Perception Assessments via Integrating Multimodal Large Language Models with Street View Images

Posted: June 3, 2025 | Author: jarxiv

Abstract: Measuring urban safety perception, which has traditionally relied heavily on human resources, is … Continue reading →

Categories: cs.CV

Distractor-free Generalizable 3D Gaussian Splatting

Posted: June 3, 2025 | Author: jarxiv

Abstract: We present DGGS, a novel framework that addresses a previously unexplored challenge … Continue reading →

Categories: cs.CV

Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video

Posted: June 3, 2025 | Author: jarxiv

Abstract: Robust tools and publicly available pre-trained models, in language-model mechanistic … Continue reading →

Categories: cs.AI, cs.CV, cs.LG

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Posted: June 3, 2025 | Author: jarxiv

Abstract: Vision-Language Generative Reward Mod … Continue reading →

Categories: cs.CL, cs.CV

RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection

Posted: June 3, 2025 | Author: jarxiv

Abstract: Large language models (LLMs), in a variety of domains including radiology report generation, … Continue reading →

Categories: cs.CL, cs.CV

A Survey on Event-driven 3D Reconstruction: Development under Different Categories

Posted: June 3, 2025 | Author: jarxiv

Abstract: Event cameras, owing to their high temporal resolution, low latency, and high dynamic range, … Continue reading →

Categories: cs.AI, cs.CV, cs.GR

RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers

Posted: June 3, 2025 | Author: jarxiv

Abstract: Feedforward network (FFN) layers, rather than attention layers, Vi … Continue reading →

Categories: cs.AI, cs.CV

