Monthly Archives: June 2023

Global and Local Semantic Completion Learning for Vision-Language Pre-training

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Cross-modal alignment, in vision-language pre-training (VLP) mode … Continue reading →

Categories: cs.CV | Comments Off

Rotation and Translation Invariant Representation Learning with Implicit Neural Representations

Posted on: June 13, 2023 | Author: jarxiv

Abstract: In many computer vision applications, images at arbitrary or rando … Continue reading →

Categories: cs.AI, cs.CV | Comments Off

Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases

Posted on: June 13, 2023 | Author: jarxiv

Abstract: The problem of spurious correlations (SC) is that a classifier, with labels in the training data, by chance … Continue reading →

Categories: cs.CV, cs.LG | Comments Off

InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Enhancing AI systems so that they can perform tasks by following human instructions … Continue reading →

Categories: cs.CV | Comments Off

Frequency-Based Vulnerability Analysis of Deep Learning Models against Image Corruptions

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Deep learning models often face challenges when handling real-world image corruptions … Continue reading →

Categories: cs.AI, cs.CR, cs.CV, cs.LG | Comments Off

CD-CTFM: A Lightweight CNN-Transformer Network for Remote Sensing Cloud Detection Fusing Multiscale Features

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Clouds in remote sensing images inevitably affect information extraction, and subsequent … Continue reading →

Categories: cs.CV, cs.LG, eess.IV | Comments Off

Retrieval-Enhanced Contrastive Vision-Text Models

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Contrastive image-text models such as CLIP are, for many state-of-the-art systems, the building … Continue reading →

Categories: cs.CV | Comments Off

AROID: Improving Adversarial Robustness through Online Instance-wise Data Augmentation

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Deep neural networks are vulnerable to adversarial examples. Adversarial … Continue reading →

Categories: cs.AI, cs.CV, cs.LG | Comments Off

Fill-Up: Balancing Long-Tailed Data with Generative Models

Posted on: June 13, 2023 | Author: jarxiv

Abstract: The latest text-to-image synthesis models, with an exceptional level of photorealism … Continue reading →

Categories: cs.CV, cs.LG | Comments Off

Valley: Video Assistant with Large Language model Enhanced abilitY

Posted on: June 13, 2023 | Author: jarxiv

Abstract: Recently, several multimodal models have been developed for joint understanding of images and language … Continue reading →

Categories: cs.AI, cs.CL, cs.CV | Comments Off
