Author archive: jarxiv

ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation

Posted on January 9, 2025 by jarxiv

Summary: Recent work has leveraged CLIP where only unannotated images are available, a challenging …

Categories: cs.CV

Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time

Posted on January 9, 2025 by jarxiv

Summary: Automatically predicted human feedback in the training process of generative models …

Categories: cs.CL, cs.CV

SplineFormer: An Explainable Transformer-Based Approach for Autonomous Endovascular Navigation

Posted on January 9, 2025 by jarxiv

Summary: Endovascular navigation is a crucial aspect of minimally invasive procedures, and a successful intervention requires …

Categories: cs.CV, cs.RO, eess.IV

TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation

Posted on January 9, 2025 by jarxiv

Summary: Visual place recognition (VPR), in the autonomous exploration of mobile robots within complex outdoor environments, …

Categories: cs.CV

Tutorial on Diffusion Models for Imaging and Vision

Posted on January 9, 2025 by jarxiv

Summary: With the remarkable growth of generative tools in recent years, text-to-image generation and text-to-…

Categories: cs.CV, cs.LG

Towards Fair Class-wise Robustness: Class Optimal Distribution Adversarial Training

Posted on January 9, 2025 by jarxiv

Summary: Adversarial training, for deep neural networks against adversarial attacks, …

Categories: cs.CV, cs.LG

NeuralDiffuser: Neuroscience-inspired Diffusion Guidance for fMRI Visual Reconstruction

Posted on January 9, 2025 by jarxiv

Summary: By reconstructing visual stimuli from functional magnetic resonance imaging (fMRI), brain activity …

Categories: cs.AI, cs.CV, cs.NE

Embedding Similarity Guided License Plate Super Resolution

Posted on January 9, 2025 by jarxiv

Summary: Super-resolution (SR) techniques, especially where accurate license plate recognition is critical, …

Categories: cs.CV, eess.IV

Combining YOLO and Visual Rhythm for Vehicle Counting

Posted on January 9, 2025 by jarxiv

Summary: Video-based vehicle detection and counting play an important role in managing transportation infrastructure …

Categories: cs.CV, cs.LG

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Posted on January 9, 2025 by jarxiv

Summary: Video large language models (Video LLMs) have recently … general video understanding …

Categories: cs.AI, cs.CV, cs.LG
