Gated Class-Attention with Cascaded Feature Drift Compensation for Exemplar-free Continual Learning of Vision Transformers

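Abstract

This paper proposes a new method for exemplar-free class-incremental training of ViTs.
The main challenge of exemplar-free continual learning is maintaining the learner's plasticity without catastrophically forgetting previously learned tasks.
This is often achieved through exemplar replay, which helps recalibrate previous-task classifiers to the feature drift that occurs when new tasks are learned.
Exemplar replay, however, comes at the cost of retaining samples from previous tasks, which may not be possible in some applications.
To address the problem of continual ViT training, we first propose gated class-attention to minimize drift in the final ViT transformer block.
This mask-based gating is applied to the class-attention mechanism of the last transformer block and strongly regulates the weights that are crucial for previous tasks.
Second, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when new tasks are learned.
The combination of gated class-attention and cascaded feature drift compensation allows plasticity towards new tasks while limiting forgetting of previous ones.
Extensive experiments on CIFAR-100, Tiny-ImageNet, and ImageNet100 show that the method outperforms existing exemplar-free state-of-the-art methods without storing any representative exemplars of past tasks.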

Abstract (original)

In this paper, we propose a new method for exemplar-free class-incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay, which can help recalibrate previous task classifiers to the feature drift that occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks, which for some applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to the class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Secondly, we propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks. The combination of gated class-attention and cascaded feature drift compensation allows for plasticity towards new tasks while limiting forgetting of previous ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and ImageNet100 demonstrate that our method outperforms existing exemplar-free state-of-the-art methods without the need to store any representative exemplars of past tasks.
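
The abstract describes the gating only at a high level, so the following is a minimal sketch of one common way to realize mask-based gating, not the authors' implementation. It assumes a per-task soft mask obtained from a sigmoid over a learnable embedding, and a gradient that is scaled down wherever earlier tasks' masks are active. The function names (`gate`, `cumulative_mask`, `gated_update`), the `scale` and `lr` values, and the per-unit weight vector are all hypothetical and chosen for illustration.

```python
import math

def gate(embedding, scale=6.0):
    """Soft mask in (0, 1): sigmoid of a scaled per-task embedding (assumed form)."""
    return [1.0 / (1.0 + math.exp(-scale * e)) for e in embedding]

def cumulative_mask(previous_masks, width):
    """Element-wise maximum over the masks of all earlier tasks (zeros if none)."""
    cum = [0.0] * width
    for mask in previous_masks:
        cum = [max(c, m) for c, m in zip(cum, mask)]
    return cum

def gated_update(weights, grads, cum_mask, lr=0.1):
    """SGD step in which each gradient is scaled by (1 - cumulative mask).

    Units that earlier tasks rely on (mask near 1) barely move;
    unused units (mask near 0) stay fully plastic.
    """
    return [w - lr * g * (1.0 - c) for w, g, c in zip(weights, grads, cum_mask)]

# Toy example: task 1 claimed the first two units and left the last two free.
task1_mask = gate([3.0, 3.0, -3.0, -3.0])
cum = cumulative_mask([task1_mask], width=4)

weights = [1.0, 1.0, 1.0, 1.0]
grads = [1.0, 1.0, 1.0, 1.0]      # pretend gradient from the new task
new_weights = gated_update(weights, grads, cum)
```

In this toy run the first two weights stay essentially at 1.0 while the last two move to roughly 0.9, which illustrates the trade-off the abstract describes: weights crucial for previous tasks are strongly regulated while the remaining capacity stays available for new tasks. In a real transformer block the same scaling would apply to the gradients of weight matrices rather than to a flat vector.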

arXiv information

Authors	Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van de Weijer
Published	2022-11-22 14:13:15+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
