Monthly Archive: January 2024

FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

Posted: January 19, 2024 by jarxiv

Abstract: Despite their impressive generative capabilities, LLMs in real-world applications …

Categories: cs.AI, cs.CL, cs.CV, cs.IR, cs.LG

VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition

Posted: January 19, 2024 by jarxiv

Abstract: Scene text recognition (STR) recognizes text in images of natural scenes …

Categories: cs.CV

Exposing Lip-syncing Deepfakes from Mouth Inconsistencies

Posted: January 19, 2024 by jarxiv

Abstract: Lip-syncing deepfakes are digitally manipulated videos, using AI models …

Categories: cs.CV

UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding

Posted: January 19, 2024 by jarxiv

Abstract: Contrastive language-image pre-traini …

Categories: cs.CV

Sub2Full: split spectrum to boost OCT despeckling without clean data

Posted: January 19, 2024 by jarxiv

Abstract: Optical coherence tomography (OCT) suffers from speckle noise, and in particular visible-light …

Categories: cs.CV, eess.IV

Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study

Posted: January 19, 2024 by jarxiv

Abstract: Medical image datasets, used in computer-aided diagnosis, treatment planning, and medical research …

Categories: cs.CV, eess.IV

Model Compression Techniques in Biometrics Applications: A Survey

Posted: January 19, 2024 by jarxiv

Abstract: The development of deep learning algorithms has extensively strengthened humanity's ability to automate tasks …

Categories: cs.AI, cs.CV

Hyperbolic Image-Text Representations

Posted: January 19, 2024 by jarxiv

Abstract: Visual and linguistic concepts are naturally organized into a hierarchy, with the textual concept "dog" …

Categories: cs.CV, cs.LG

Explicitly Disentangled Representations in Object-Centric Learning

Posted: January 19, 2024 by jarxiv

Abstract: Extracting structured representations from raw visual data is an important and long-standing …

Categories: cs.AI, cs.CV, cs.LG

Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation

Posted: January 19, 2024 by jarxiv

Abstract: Recent large-scale pre-trained diffusion models, given detailed text descriptions, …

Categories: cs.CV
