Monthly Archive: April 2024

MuseumMaker: Continual Style Customization without Catastrophic Forgetting

Posted on April 26, 2024 by jarxiv

Abstract: Equipped with appropriate text prompts, large-scale pre-trained Tex …

Categories: cs.CV

Denoising: from classical methods to deep CNNs

Posted on April 26, 2024 by jarxiv

Abstract: This paper aims to explore the evolution of image denoising in a pedagogical way …

Categories: cs.CV, math.HO

DAVE — A Detect-and-Verify Paradigm for Low-Shot Counting

Posted on April 26, 2024 by jarxiv

Abstract: Low-shot counters, with hardly any annotated exemplars in the image …

Categories: cs.CV

Self-Balanced R-CNN for Instance Segmentation

Posted on April 26, 2024 by jarxiv

Abstract: For the instance segmentation task, current state-of-the-art two-stage mo …

Categories: cs.CV

TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning

Posted on April 26, 2024 by jarxiv

Abstract: Charts are important for presenting and explaining complex data relationships. Recently, multi …

Categories: cs.CV

Zero-Shot Distillation for Image Encoders: How to Make Effective Use of Synthetic Data

Posted on April 26, 2024 by jarxiv

Abstract: Multimodal foundation models such as CLIP have demonstrated impressive zero-shot capabilities …

Categories: cs.CV

PhyRecon: Physically Plausible Neural Scene Reconstruction

Posted on April 26, 2024 by jarxiv

Abstract: Neural implicit representations have gained popularity in multi-view 3D reconstruction, but …

Categories: cs.CV

EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning

Posted on April 26, 2024 by jarxiv

Abstract: Visual instruction tuning uses task-specific instructions on pre-trained …

Categories: cs.AI, cs.CV

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding

Posted on April 26, 2024 by jarxiv

Abstract: Vision-language models (VLMs) such as CLIP have strong image-text comprehension abilities …

Categories: cs.CV

Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior

Posted on April 26, 2024 by jarxiv

Abstract: Colorizing grayscale images offers an engaging visual experience. Existing …

Categories: cs.CV
