Optimal transfer protocol by incremental layer defrosting

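Summary

Transfer learning is a powerful tool that makes it possible to train models with limited amounts of data.
This technique is especially useful in real-world problems, where data availability is often a serious limitation.
The simplest transfer learning protocol "freezes" the feature-extractor layers of a network pre-trained on a data-rich source task and adapts only the last layers to a data-poor target task.
This workflow rests on the assumption that the feature maps of the pre-trained model are qualitatively similar to those that would be learned with enough data on the target task.
This work shows that the protocol is often sub-optimal, and that the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
In particular, a controlled framework is used to identify the optimal transfer depth, which turns out to depend non-trivially on the amount of available training data and on the degree of source-target task correlation.
Transfer optimality is then characterized by analyzing, through multiple established similarity measures, the internal representations of two networks trained from scratch on the source task and on the target task.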

Summary (original)

Transfer learning is a powerful tool enabling model training with limited amounts of data. This technique is particularly useful in real-world problems where data availability is often a serious limitation. The simplest transfer learning protocol is based on “freezing” the feature-extractor layers of a network pre-trained on a data-rich source task, and then adapting only the last layers to a data-poor target task. This workflow is based on the assumption that the feature maps of the pre-trained model are qualitatively similar to the ones that would have been learned with enough data on the target task. In this work, we show that this protocol is often sub-optimal, and the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen. In particular, we make use of a controlled framework to identify the optimal transfer depth, which turns out to depend non-trivially on the amount of available training data and on the degree of source-target task correlation. We then characterize transfer optimality by analyzing the internal representations of two networks trained from scratch on the source and the target task through multiple established similarity measures.
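
To make the notion of "transfer depth" concrete, here is a minimal, framework-free sketch. It is not the authors' implementation: the `Layer` class, the function names, and the idea of scoring each configuration with a user-supplied `evaluate` callable are all assumptions introduced for illustration. The sketch only shows the bookkeeping: the first `depth` layers stay frozen, the remaining layers are trainable, and defrosting proceeds one layer at a time from the output side.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class Layer:
    """Hypothetical stand-in for a network layer; only tracks trainability."""
    name: str
    trainable: bool = True


def set_transfer_depth(layers: List[Layer], depth: int) -> List[Layer]:
    """Return a copy where the first `depth` layers are frozen and the rest train."""
    if not 0 <= depth <= len(layers):
        raise ValueError("depth must be between 0 and the number of layers")
    return [Layer(l.name, trainable=(i >= depth)) for i, l in enumerate(layers)]


def defrosting_schedule(layers: List[Layer]) -> Iterator[Tuple[int, List[Layer]]]:
    """Yield (depth, configuration) pairs, defrosting one more layer each step.

    Starts with only the last layer trainable and ends with nothing frozen.
    """
    for depth in range(len(layers) - 1, -1, -1):
        yield depth, set_transfer_depth(layers, depth)


def sweep_transfer_depth(
    layers: List[Layer],
    evaluate: Callable[[List[Layer]], float],
) -> int:
    """Return the depth whose configuration gets the highest score.

    `evaluate` is an assumed callback: in practice it would fine-tune the
    trainable layers on the target task and return a validation metric.
    """
    best_depth, best_score = None, float("-inf")
    for depth, config in defrosting_schedule(layers):
        score = evaluate(config)
        if score > best_score:
            best_depth, best_score = depth, score
    return best_depth
```

In a real experiment, `evaluate` would wrap the fine-tuning run for each configuration, so the cost of the sweep grows with the number of layers considered.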

arXiv information

Authors	Federica Gerace,Diego Doimo,Stefano Sarao Mannelli,Luca Saglietti,Alessandro Laio
Published	2023-03-02 17:32:11+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
