The Role of Pre-training Data in Transfer Learning

Summary

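The transfer learning paradigm of pre-training a model and then fine-tuning it produces high-accuracy models.
Most studies recommend scaling up the pre-training set to get the most out of transfer learning, but a question remains: what data and method should be used for pre-training?
We investigate how the pre-training data distribution affects few-shot and full fine-tuning performance, using 3 pre-training methods (supervised, contrastive language-image, and contrastive image-image), 7 pre-training datasets, and 9 downstream datasets.
Through extensive controlled experiments, we find that the choice of pre-training data source is essential for few-shot transfer, but its role diminishes as more data becomes available for fine-tuning.
We also explore the role of data curation and examine the trade-off between label noise and pre-training dataset size.
We find that pre-training on 2000X more data from LAION can match the performance of supervised ImageNet pre-training.
Finally, we compare language-image contrastive and image-image contrastive pre-training and find that the latter leads to better downstream accuracy.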

Abstract (original)

The transfer learning paradigm of model pre-training and subsequent fine-tuning produces high-accuracy models. While most studies recommend scaling the pre-training size to benefit most from transfer learning, a question remains: what data and method should be used for pre-training? We investigate the impact of pre-training data distribution on the few-shot and full fine-tuning performance using 3 pre-training methods (supervised, contrastive language-image and image-image), 7 pre-training datasets, and 9 downstream datasets. Through extensive controlled experiments, we find that the choice of the pre-training data source is essential for the few-shot transfer, but its role decreases as more data is made available for fine-tuning. Additionally, we explore the role of data curation and examine the trade-offs between label noise and the size of the pre-training dataset. We find that using 2000X more pre-training data from LAION can match the performance of supervised ImageNet pre-training. Furthermore, we investigate the effect of pre-training methods, comparing language-image contrastive vs. image-image contrastive, and find that the latter leads to better downstream accuracy.

arXiv information

Authors	Rahim Entezari, Mitchell Wortsman, Olga Saukh, M. Moein Shariatnia, Hanie Sedghi, Ludwig Schmidt
Published	2023-03-01 13:48:55+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
