Rethinking Momentum Knowledge Distillation in Online Continual Learning

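Abstract

Online Continual Learning (OCL) addresses the problem of training neural networks on a continuous data stream in which multiple classification tasks appear in sequence.
In contrast to offline Continual Learning, each sample can be seen only once in OCL.
In this setting, replay-based strategies have achieved impressive results, and most state-of-the-art approaches depend heavily on them.
Knowledge Distillation (KD) is widely used in offline Continual Learning, but despite its potential it remains under-exploited in OCL.
This paper theoretically analyzes the challenges of applying KD to OCL.
The authors introduce a direct yet effective methodology for applying Momentum Knowledge Distillation (MKD) to many flagship OCL methods and demonstrate its ability to enhance existing approaches.
In addition to improving existing state-of-the-art accuracy on ImageNet100 by more than 10 percentage points, they shed light on the internal mechanics of MKD and its impact during training in OCL.
They argue that, like replay, MKD should be considered a central component of OCL.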

Abstract (original)

Online Continual Learning (OCL) addresses the problem of training neural networks on a continuous data stream where multiple classification tasks emerge in sequence. In contrast to offline Continual Learning, data can be seen only once in OCL. In this context, replay-based strategies have achieved impressive results and most state-of-the-art approaches are heavily depending on them. While Knowledge Distillation (KD) has been extensively used in offline Continual Learning, it remains under-exploited in OCL, despite its potential. In this paper, we theoretically analyze the challenges in applying KD to OCL. We introduce a direct yet effective methodology for applying Momentum Knowledge Distillation (MKD) to many flagship OCL methods and demonstrate its capabilities to enhance existing approaches. In addition to improving existing state-of-the-arts accuracy by more than $10\%$ points on ImageNet100, we shed light on MKD internal mechanics and impacts during training in OCL. We argue that similar to replay, MKD should be considered a central component of OCL.

arXiv information

Authors	Nicolas Michel, Maorong Wang, Ling Xiao, Toshihiko Yamasaki
Published	2023-09-06 09:49:20+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
