Knowledge Distillation from A Stronger Teacher

Summary

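Existing knowledge distillation methods focus on baseline settings, in which the teacher models and training strategies are weaker than, and not competitive with, state-of-the-art approaches. This paper instead presents a method called DIST for distilling better from a stronger teacher.
Empirically, the discrepancy between the predictions of the student and those of a stronger teacher tends to be considerably more severe.
As a result, forcing an exact match of predictions through KL divergence disturbs training and makes existing methods perform poorly.
The paper shows that simply preserving the relations between the teacher's and the student's predictions is sufficient, and proposes a correlation-based loss that explicitly captures the intrinsic inter-class relations from the teacher.
In addition, because different instances have different semantic similarities to each class, this relational match is extended to the intra-class level.
The method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes, and training strategies, consistently achieving state-of-the-art performance on image classification, object detection, and semantic segmentation tasks.
Code is available at https://github.com/hunto/DIST_KD.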
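
To make the idea of a correlation-based relational loss more concrete, here is a minimal NumPy sketch. It is not taken from the authors' repository. It assumes Pearson correlation as the correlation measure (the summary only says "correlation-based"), and the names `relation_loss`, `pearson_rows`, `temperature`, `beta`, and `gamma` are hypothetical choices made for illustration. The inter-class term compares, for each instance, the teacher's and student's probability vectors over classes; the intra-class term compares, for each class, the probabilities over the instances of a batch.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with a temperature (the temperature is an assumption)."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def pearson_rows(a, b, eps=1e-8):
    """Pearson correlation between matching rows of a and b."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den


def relation_loss(student_logits, teacher_logits,
                  temperature=1.0, beta=1.0, gamma=1.0):
    """Hypothetical relational loss: 1 - correlation, inter-class plus intra-class.

    Inputs have shape (batch, classes). beta and gamma are assumed weights.
    """
    p_s = softmax(student_logits, temperature)
    p_t = softmax(teacher_logits, temperature)
    # Inter-class: one correlation per instance, computed across classes.
    inter = 1.0 - pearson_rows(p_s, p_t).mean()
    # Intra-class: one correlation per class, computed across instances.
    intra = 1.0 - pearson_rows(p_s.T, p_t.T).mean()
    return beta * inter + gamma * intra


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(8, 5))
    student = rng.normal(size=(8, 5))
    print("matching predictions:", relation_loss(teacher, teacher))
    print("unrelated predictions:", relation_loss(student, teacher))
```

Because the loss depends only on correlation, it is close to zero whenever the student preserves the teacher's relative preferences, which is a looser requirement than matching the teacher's probabilities exactly as KL divergence demands.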

Summary (original)

Unlike existing knowledge distillation methods focus on the baseline settings, where the teacher models and training strategies are not that strong and competing as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. We empirically find that the discrepancy of predictions between the student and a stronger teacher may tend to be fairly severer. As a result, the exact match of predictions in KL divergence would disturb the training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of teacher and student would suffice, and propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly. Besides, considering that different instances have different semantic similarities to each class, we also extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes and training strategies, and can achieve state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at: https://github.com/hunto/DIST_KD .

arXiv information

Authors	Tao Huang, Shan You, Fei Wang, Chen Qian, Chang Xu
Published	2022-12-28 04:02:19+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
