What Makes Good Examples for Visual In-Context Learning?

Summary

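Large-scale models trained on broad data have recently become the mainstream architecture in computer vision thanks to their strong generalization performance.
This paper focuses on an emergent ability of large vision models known as in-context learning, which enables inference on unseen tasks by conditioning on in-context examples (a.k.a. prompts) without updating the model parameters.
The concept is well known in natural language processing but has only very recently been studied for large vision models.
We provide the first comprehensive investigation of the impact of in-context examples in computer vision and find that performance is highly sensitive to the choice of in-context examples.
To overcome this problem, we propose a prompt retrieval framework that automates the selection of in-context examples.
Specifically, we present (1) an unsupervised prompt retrieval method based on nearest example search using an off-the-shelf model, and (2) a supervised prompt retrieval method that trains a neural network to choose examples that directly maximize in-context learning performance.
The results show that our methods can bring non-trivial improvements to visual in-context learning compared with the commonly used random selection.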
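
The unsupervised variant described above (nearest example search over features from an off-the-shelf model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `retrieve_prompt`, the use of cosine similarity, and the toy 2-D feature vectors are all assumptions made for illustration. In practice the features would come from a pretrained image encoder, which is not modeled here.

```python
import numpy as np

def retrieve_prompt(query_feat, pool_feats, k=1):
    """Return indices of the k pool examples closest to the query.

    Hypothetical helper: similarity is cosine similarity between
    feature vectors, which is an assumption of this sketch.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    p = pool_feats / np.linalg.norm(pool_feats, axis=1, keepdims=True)
    sims = p @ q
    # Sort by descending similarity and keep the top k indices.
    return np.argsort(-sims)[:k]

# Toy features standing in for embeddings of candidate in-context examples.
pool_feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
# Toy feature standing in for the embedding of the query image.
query_feat = np.array([0.9, 0.1])

best = retrieve_prompt(query_feat, pool_feats, k=2)
print(best.tolist())
```

The returned indices identify which candidates would be used as the in-context examples in place of a randomly chosen one.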

Summary (original)

Large-scale models trained on broad data have recently become the mainstream architecture in computer vision due to their strong generalization performance. In this paper, the main focus is on an emergent ability in large vision models, known as in-context learning, which allows inference on unseen tasks by conditioning on in-context examples (a.k.a. prompt) without updating the model parameters. This concept has been well-known in natural language processing but has only been studied very recently for large vision models. We for the first time provide a comprehensive investigation on the impact of in-context examples in computer vision, and find that the performance is highly sensitive to the choice of in-context examples. To overcome the problem, we propose a prompt retrieval framework to automate the selection of in-context examples. Specifically, we present (1) an unsupervised prompt retrieval method based on nearest example search using an off-the-shelf model, and (2) a supervised prompt retrieval method, which trains a neural network to choose examples that directly maximize in-context learning performance. The results demonstrate that our methods can bring non-trivial improvements to visual in-context learning in comparison to the commonly-used random selection.

arXiv information

Authors	Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
Published	2023-01-31 14:40:05+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
