Unifying Synergies between Self-supervised Learning and Dynamic Computation

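Summary

Computationally expensive training strategies make self-supervised learning (SSL) impractical in resource-constrained industrial settings.
Techniques such as knowledge distillation (KD), dynamic computation (DC), and pruning are often used to obtain a lightweight model. These usually involve fine-tuning (or distilling) a large pre-trained model over multiple epochs, which makes the overall process even more computationally demanding.
This work presents a new perspective on the interplay between the SSL and DC paradigms.
In particular, it shows that a dense sub-network and a gated sub-network can be learned simultaneously from scratch in an SSL setting, without any additional fine-tuning or pruning steps.
Because the dense and gated encoders co-evolve during pre-training, the method offers a good accuracy-efficiency trade-off and yields a generic, multi-purpose architecture for application-specific industrial settings.
Extensive experiments on several image classification benchmarks, including CIFAR-10/100, STL-10, and ImageNet-100, demonstrate that the proposed training strategy provides a dense sub-network and a corresponding gated sub-network whose performance is on par with the vanilla self-supervised setting.
The gated sub-network achieves this with a significant reduction in computation, measured in FLOPs, across a range of target budgets (t_d).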
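
The relationship between a gated sub-network, its compute cost, and a target budget can be illustrated with a small calculation. The following is a minimal sketch under stated assumptions, not the authors' method: it assumes binary channel gates on a single convolution layer, counts multiply-accumulate operations as a proxy for FLOPs, and uses a squared-error penalty that pulls the fraction of active gates toward a target t_d. The function names `conv_macs`, `gated_conv_macs`, and `budget_penalty` are hypothetical and chosen only for illustration.

```python
def conv_macs(c_in, c_out, kernel, h_out, w_out):
    """Multiply-accumulate count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * h_out * w_out


def gated_conv_macs(c_in, out_gates, kernel, h_out, w_out):
    """MACs when only output channels with gate == 1 are computed.

    Assumption: gates are binary and applied per output channel.
    """
    active_out = sum(out_gates)
    return conv_macs(c_in, active_out, kernel, h_out, w_out)


def budget_penalty(gates, target):
    """Squared gap between the active-gate fraction and the target budget.

    Assumption: this penalty form is illustrative, not taken from the paper.
    """
    active_fraction = sum(gates) / len(gates)
    return (active_fraction - target) ** 2


if __name__ == "__main__":
    gates = [1, 0] * 64  # 128 output channels, half of them active
    dense = conv_macs(64, 128, 3, 16, 16)
    gated = gated_conv_macs(64, gates, 3, 16, 16)
    print("dense MACs:", dense)
    print("gated MACs:", gated)
    print("compute ratio:", gated / dense)
    print("penalty at t_d = 0.5:", budget_penalty(gates, 0.5))
```

In this toy setting the gated layer's cost scales linearly with the fraction of active output channels, so a target budget of 0.5 corresponds to roughly half the dense layer's compute.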

Summary (original)

Computationally expensive training strategies make self-supervised learning (SSL) impractical for resource-constrained industrial settings. Techniques like knowledge distillation (KD), dynamic computation (DC), and pruning are often used to obtain a lightweight model, which usually involves multiple epochs of fine-tuning (or distilling steps) of a large pre-trained model, making it more computationally challenging. In this work we present a novel perspective on the interplay between SSL and DC paradigms. In particular, we show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in an SSL setting without any additional fine-tuning or pruning steps. The co-evolution during pre-training of both dense and gated encoder offers a good accuracy-efficiency trade-off and therefore yields a generic and multi-purpose architecture for application-specific industrial settings. Extensive experiments on several image classification benchmarks including CIFAR-10/100, STL-10 and ImageNet-100 demonstrate that the proposed training strategy provides a dense and corresponding gated sub-network that achieves on-par performance compared with the vanilla self-supervised setting, but at a significant reduction in computation in terms of FLOPs, under a range of target budgets (t_d).

arXiv information

Authors	Tarun Krishna, Ayush K Rai, Alexandru Drimbarean, Eric Arazo, Paul Albert, Alan F Smeaton, Kevin McGuinness, Noel E O’Connor
Published	2023-09-06 12:15:06+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
