Monthly Archives: April 2025

Cross-Hierarchical Bidirectional Consistency Learning for Fine-Grained Visual Classification

Posted: April 21, 2025 | Author: jarxiv

Abstract: In fine-grained visual classification (FGVC), closely related …

Category: cs.CV

Compile Scene Graphs with Reinforcement Learning

Posted: April 21, 2025 | Author: jarxiv

Abstract: Next-token prediction for training large language models (LLMs) …

Category: cs.CV

Visual Intention Grounding for Egocentric Assistants

Posted: April 21, 2025 | Author: jarxiv

Abstract: In visual grounding, textual descriptions and objects in an image …

Category: cs.CV

SupResDiffGAN a new approach for the Super-Resolution task

Posted: April 21, 2025 | Author: jarxiv

Abstract: In this work, for the super-resolution task, generative adversarial networks (GANs) and diffusion model …

Categories: cs.CV, eess.IV

DenSe-AdViT: A novel Vision Transformer for Dense SAR Object Detection

Posted: April 21, 2025 | Author: jarxiv

Abstract: Owing to their exceptional ability to extract global features, Vision Transformers (ViTs), in synthetic aperture …

Category: cs.CV

AnomalyControl: Learning Cross-modal Semantic Features for Controllable Anomaly Synthesis

Posted: April 21, 2025 | Author: jarxiv

Abstract: Anomaly synthesis is an important approach for augmenting anomalous data to advance anomaly inspection …

Categories: cs.AI, cs.CV

The Mirage of Performance Gains: Why Contrastive Decoding Fails to Address Multimodal Hallucination

Posted: April 21, 2025 | Author: jarxiv

Abstract: Contrastive decoding strategies, with respect to hallucination in multimodal large language models (MLLMs), …

Categories: cs.AI, cs.CL, cs.CV

IReNe: Instant Recoloring of Neural Radiance Fields

Posted: April 21, 2025 | Author: jarxiv

Abstract: Advances in NeRF have enabled 3D scene reconstruction and novel view synthesis …

Category: cs.CV

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations

Posted: April 21, 2025 | Author: jarxiv

Abstract: Contrastive instance discrimination methods, in downstream tasks such as image classification and object detection, …

Category: cs.CV

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Posted: April 21, 2025 | Author: jarxiv

Abstract: Text-to-image (T2I) generation models have advanced significantly in recent years. …

Category: cs.CV
