Archive of posts by "jarxiv"

Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels

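Posted: February 28, 2025 | Author: jarxiv

Summary: For accurate 3D gaze estimation in unconstrained real-world environments, appearance, head pose, … Continue reading →

Category: cs.CV | Comments are closed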

Deep Modeling of Non-Gaussian Aleatoric Uncertainty

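Posted: February 28, 2025 | Author: jarxiv

Summary: Deep learning, particularly with respect to the traditional assumption that the uncertainty distribution is fixed and Gaussian, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG, cs.RO | Comments are closed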

Do computer vision foundation models learn the low-level characteristics of the human visual system?

Posted: February 28, 2025 | Author: jarxiv

Summary: Computer vision foundation models such as DINO and OpenCLIP … Continue reading →

Category: cs.CV | Comments are closed

Vector-Quantized Vision Foundation Models for Object-Centric Learning

Posted: February 28, 2025 | Author: jarxiv

Summary: Decomposing visual scenes into objects, as humans do, … Continue reading →

Category: cs.CV | Comments are closed

HVI: A New color space for Low-light Image Enhancement

Posted: February 28, 2025 | Author: jarxiv

Summary: In Low-light Image Enhancement (LLIE), corrupted … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments are closed

Explainable, Multi-modal Wound Infection Classification from Images Augmented with Generated Captions

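Posted: February 28, 2025 | Author: jarxiv

Summary: Infections in diabetic foot ulcers (DFUs) cause severe complications, including tissue death and limb amputation, … Continue reading →

Categories: cs.AI, cs.CV | Comments are closed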

Visual Adaptive Prompting for Compositional Zero-Shot Learning

Posted: February 28, 2025 | Author: jarxiv

Summary: For Vision-Language Models (VLMs), visual data and … Continue reading →

Categories: cs.CV, cs.LG | Comments are closed

Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription

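Posted: February 28, 2025 | Author: jarxiv

Summary: Handwritten text recognition (HTR), particularly where pages have a common format and context, … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments are closed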

A Dataset and Framework for Learning State-invariant Object Representations

Posted: February 28, 2025 | Author: jarxiv

Summary: For learning object representations for recognition and retrieval, the more commonly used … Continue reading →

Categories: cs.CV, cs.IR, cs.LG | Comments are closed

M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging

Posted: February 28, 2025 | Author: jarxiv

Summary: Agentic AI systems, for their ability to carry out complex tasks autonomously, … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments are closed
