Enhancing Vision-Language Few-Shot Adaptation with Negative Learning

Summary

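Large-scale pre-trained vision-language models (VLMs) show strong zero-shot performance and transferability, which lets them adapt to downstream tasks in a data-efficient way.
However, when only a few labeled samples are available, adapting a VLM to distinguish subtle differences between similar classes in a specific downstream task remains difficult.
This work proposes SimNL, a simple yet effective negative learning approach that exploits task-specific knowledge from few-shot labeled samples more efficiently.
Unlike previous methods, which focus on identifying a set of representative positive features that define "what is a {CLASS}", SimNL discovers a complementary set of negative features that define "what is not a {CLASS}". These negative features supplement the positive ones with additional insight and strengthen task-specific recognition.
The authors also find that current adaptation approaches are particularly vulnerable to potential noise in the few-shot sample set.
To mitigate this, they introduce a plug-and-play few-shot instance reweighting technique that suppresses noisy outliers and amplifies clean samples for more stable adaptation.
Extensive experiments across 15 datasets show that SimNL outperforms existing state-of-the-art methods on both few-shot learning and domain generalization tasks while achieving competitive computational efficiency.
Code is available at https://github.com/zhangce01/SimNL.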

Summary (original)

Large-scale pre-trained Vision-Language Models (VLMs) have exhibited impressive zero-shot performance and transferability, allowing them to adapt to downstream tasks in a data-efficient manner. However, when only a few labeled samples are available, adapting VLMs to distinguish subtle differences between similar classes in specific downstream tasks remains challenging. In this work, we propose a Simple yet effective Negative Learning approach, SimNL, to more efficiently exploit the task-specific knowledge from few-shot labeled samples. Unlike previous methods that focus on identifying a set of representative positive features defining ‘what is a {CLASS}’, SimNL discovers a complementary set of negative features that define ‘what is not a {CLASS}’, providing additional insights that supplement the positive features to enhance task-specific recognition capability. Further, we identify that current adaptation approaches are particularly vulnerable to potential noise in the few-shot sample set. To mitigate this issue, we introduce a plug-and-play few-shot instance reweighting technique to suppress noisy outliers and amplify clean samples for more stable adaptation. Our extensive experimental results across 15 datasets validate that the proposed SimNL outperforms existing state-of-the-art methods on both few-shot learning and domain generalization tasks while achieving competitive computational efficiency. Code is available at https://github.com/zhangce01/SimNL.
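The abstract describes two ideas: scoring a class with both positive ("what is a {CLASS}") and negative ("what is not a {CLASS}") features, and reweighting few-shot instances so that noisy outliers count less. The following is a minimal toy sketch of those two ideas only; it is not the SimNL implementation. Every name here (`instance_weights`, `build_prototypes`, `classify`, the `temperature` and `lam` parameters) is hypothetical, and the specific choices (mean prototypes, negatives built from the other classes' samples, softmax weighting by similarity to the class mean) are assumptions made for illustration. See the linked repository for the authors' actual method.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_mean(vectors, weights):
    """Weighted average of equal-length vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]

def instance_weights(vectors, temperature=0.1):
    """Assumed reweighting rule: softmax over each sample's cosine
    similarity to the (unweighted) class mean, so outliers get less weight."""
    unit = [normalize(v) for v in vectors]
    center = normalize(weighted_mean(unit, [1.0] * len(unit)))
    sims = [dot(u, center) for u in unit]
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    z = sum(exps)
    return [e / z for e in exps]

def build_prototypes(support):
    """support maps class name -> list of few-shot feature vectors.
    Positive prototype: reweighted mean of the class's own samples.
    Negative prototype (assumption): plain mean of all other classes' samples."""
    pos, neg = {}, {}
    for cls, vecs in support.items():
        unit = [normalize(v) for v in vecs]
        pos[cls] = normalize(weighted_mean(unit, instance_weights(vecs)))
        others = [normalize(v) for k, vs in support.items() if k != cls for v in vs]
        neg[cls] = normalize(weighted_mean(others, [1.0] * len(others)))
    return pos, neg

def classify(x, pos, neg, lam=0.5):
    """Score = similarity to the positive prototype minus lam times
    similarity to the negative prototype; return best class and all scores."""
    x = normalize(x)
    scores = {c: dot(x, pos[c]) - lam * dot(x, neg[c]) for c in pos}
    return max(scores, key=scores.get), scores
```

In this toy setup, a query is pulled toward a class by its positive prototype and pushed away by evidence that it resembles the other classes, while a mislabeled support sample contributes little to its class's positive prototype.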

arXiv information

Authors	Ce Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie
Published	2024-11-08 14:58:29+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
