"cs.CV" Category Archive

Visual Intention Grounding for Egocentric Assistants

Posted: April 21, 2025 | Author: jarxiv

Abstract: Visual grounding takes textual descriptions and the objects in an image and … Continue reading →

Categories: cs.CV | Comments are closed

SupResDiffGAN a new approach for the Super-Resolution task

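Posted: April 21, 2025 | Author: jarxiv

Abstract: In this work, for the super-resolution task, generative adversarial networks (GANs) and diffusion models … Continue reading →

Categories: cs.CV, eess.IV | Comments are closed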

DenSe-AdViT: A novel Vision Transformer for Dense SAR Object Detection

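Posted: April 21, 2025 | Author: jarxiv

Abstract: Vision Transformers (ViT), owing to their exceptional ability to extract global features, synthetic aperture … Continue reading →

Categories: cs.CV | Comments are closed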

AnomalyControl: Learning Cross-modal Semantic Features for Controllable Anomaly Synthesis

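Posted: April 21, 2025 | Author: jarxiv

Abstract: Anomaly synthesis is an important approach to augmenting anomalous data for advancing anomaly inspection … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed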

The Mirage of Performance Gains: Why Contrastive Decoding Fails to Address Multimodal Hallucination

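Posted: April 21, 2025 | Author: jarxiv

Abstract: For hallucinations in multimodal large language models (MLLMs), contrastive decoding strategies … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed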

IReNe: Instant Recoloring of Neural Radiance Fields

Posted: April 21, 2025 | Author: jarxiv

Abstract: Advances in NeRF have made 3D scene reconstruction and novel view synthesis possible … Continue reading →

Categories: cs.CV | Comments are closed

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations

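Posted: April 21, 2025 | Author: jarxiv

Abstract: Contrastive instance discrimination methods, in downstream tasks such as image classification and object detection, … Continue reading →

Categories: cs.CV | Comments are closed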

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Posted: April 21, 2025 | Author: jarxiv

Abstract: Text-to-image (T2I) generation models have advanced significantly in recent years. … Continue reading →

Categories: cs.CV | Comments are closed

Efficient Parameter Adaptation for Multi-Modal Medical Image Segmentation and Prognosis

Posted: April 21, 2025 | Author: jarxiv

Abstract: Cancer detection and prognosis rely heavily on medical imaging, particularly CT and PET scans … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction

Posted: April 21, 2025 | Author: jarxiv

Abstract: In many cases, service mobile robots, while performing their tasks, … dynamic objects … Continue reading →

Categories: cs.AI, cs.CV, cs.RO | Comments are closed
