ViTKD: Practical Guidelines for ViT feature knowledge distillation

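Summary

Knowledge distillation (KD) for convolutional neural networks (CNNs) has been widely studied as a way to improve the performance of small models. Vision Transformers (ViTs) have recently achieved great success on many computer vision tasks, so KD for ViTs is also in demand. However, apart from KD based on output logits, the feature-based KD methods developed for CNNs cannot be applied directly to ViTs because of the large structural gap between the two architectures. In this paper, we explore feature-based distillation for ViTs. Based on the nature of ViT feature maps, we design a series of controlled experiments and derive three practical guidelines for ViT feature distillation. Some of our findings are the opposite of common practice in the CNN era. Following the three guidelines, we propose ViTKD, a feature-based method that brings consistent and considerable improvements to the student. On ImageNet-1k, it raises DeiT-Tiny from 74.42% to 76.06%, DeiT-Small from 80.55% to 81.95%, and DeiT-Base from 81.76% to 83.46%. ViTKD is also complementary to logit-based KD, and the two can be applied together directly, which improves the student further: DeiT-Tiny, DeiT-Small, and DeiT-Base reach 77.78%, 83.59%, and 85.41%, respectively. The code is available at https://github.com/yzd-v/cls_KD.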

Abstract (original)

Knowledge Distillation (KD) for Convolutional Neural Network (CNN) is extensively studied as a way to boost the performance of a small model. Recently, Vision Transformer (ViT) has achieved great success on many computer vision tasks and KD for ViT is also desired. However, besides the output logit-based KD, other feature-based KD methods for CNNs cannot be directly applied to ViT due to the huge structure gap. In this paper, we explore the way of feature-based distillation for ViT. Based on the nature of feature maps in ViT, we design a series of controlled experiments and derive three practical guidelines for ViT’s feature distillation. Some of our findings are even opposite to the practices in the CNN era. Based on the three guidelines, we propose our feature-based method ViTKD which brings consistent and considerable improvement to the student. On ImageNet-1k, we boost DeiT-Tiny from 74.42% to 76.06%, DeiT-Small from 80.55% to 81.95%, and DeiT-Base from 81.76% to 83.46%. Moreover, ViTKD and the logit-based KD method are complementary and can be applied together directly. This combination can further improve the performance of the student. Specifically, the student DeiT-Tiny, Small, and Base achieve 77.78%, 83.59%, and 85.41%, respectively. The code is available at https://github.com/yzd-v/cls_KD.
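The abstract states that a feature-based loss and a logit-based KD loss are complementary and can be applied together. The sketch below illustrates only that general idea of summing the two terms with a task loss. It is not the ViTKD loss: the plain mean squared error on features, the function names, the weights, and the temperature are all assumptions made for illustration, and it assumes the student features have already been projected to the teacher's dimension.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_t, p_s)
        if pt > 0.0
    )
    return kl * temperature ** 2


def feature_mse_loss(student_feat, teacher_feat):
    """Mean squared error between two equally sized feature vectors.

    Assumption: a generic stand-in for a feature-based term,
    not the loss defined in the paper.
    """
    if len(student_feat) != len(teacher_feat):
        raise ValueError("features must have the same length")
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n


def combined_distillation_loss(
    task_loss,
    student_feat,
    teacher_feat,
    student_logits,
    teacher_logits,
    feat_weight=1.0,   # hypothetical weight for the feature term
    logit_weight=1.0,  # hypothetical weight for the logit term
    temperature=1.0,
):
    """Task loss plus weighted feature-based and logit-based terms."""
    feat = feature_mse_loss(student_feat, teacher_feat)
    kd = logit_kd_loss(student_logits, teacher_logits, temperature)
    return task_loss + feat_weight * feat + logit_weight * kd
```

In a real training loop the two terms would be computed on batched tensors from the frozen teacher and the trainable student; the repository linked in the abstract is the place to check how the authors actually define and weight them.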

arXiv information

Authors	Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li
Published	2022-09-06 11:52:46+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, DeepL
