Towards Redundancy-Free Sub-networks in Continual Learning

Summary

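Catastrophic forgetting (CF) is a prominent problem in continual learning.
Parameter isolation addresses it by masking a sub-network for each task, which reduces interference with earlier tasks.
However, these sub-networks are built from weight magnitude, which does not necessarily reflect weight importance, so unimportant weights are retained and the resulting sub-networks are redundant.
To overcome this limitation, and inspired by the information bottleneck, which removes redundancy between adjacent network layers, the authors propose the Information Bottleneck Masked sub-network (IBM) to eliminate redundancy within sub-networks.
Specifically, IBM accumulates valuable information into essential weights to construct redundancy-free sub-networks; freezing these sub-networks mitigates CF, and transferring their knowledge helps the training of new tasks.
In addition, IBM decomposes hidden representations to automate the construction process and make it more flexible.
Extensive experiments show that IBM consistently outperforms state-of-the-art methods.
In particular, IBM surpasses the state-of-the-art parameter isolation method while using 70% fewer parameters within sub-networks and 80% less training time.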
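To make the criticism of magnitude-based sub-networks concrete, here is a minimal sketch of the kind of baseline the summary describes: each task claims the largest-magnitude weights that are still free, and those weights are then frozen for later tasks. This is not the IBM method. The function name `magnitude_mask`, the `keep_ratio` parameter, and the toy 4x4 weight matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def magnitude_mask(weights, free, keep_ratio):
    """Select the largest-magnitude weights among those still free.

    Hypothetical helper: selection depends only on |w|, which is exactly
    the property the summary argues can keep unimportant weights.
    """
    n_free = int(free.sum())
    k = int(round(keep_ratio * n_free))
    mask = np.zeros_like(free, dtype=bool)
    if k == 0:
        return mask
    # Weights already claimed by earlier tasks get a score of -inf
    # so they can never be selected again.
    scores = np.where(free, np.abs(weights), -np.inf)
    top = np.argsort(scores, axis=None)[-k:]
    mask.flat[top] = True
    return mask


rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4))          # toy layer, assumed for the demo
free = np.ones_like(weights, dtype=bool)   # no weight is claimed yet

task_masks = []
for task_id in range(2):
    mask = magnitude_mask(weights, free, keep_ratio=0.5)
    task_masks.append(mask)
    free &= ~mask  # weights claimed by a task are frozen for later tasks
```

With these settings the first task claims half of the 16 weights and the second claims half of what remains, so the two masks never overlap. Because the ranking uses only `np.abs(weights)`, a large but unimportant weight is kept over a small but important one, which is the redundancy the summary attributes to this family of methods.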

Abstract (original)

Catastrophic Forgetting (CF) is a prominent issue in continual learning. Parameter isolation addresses this challenge by masking a sub-network for each task to mitigate interference with old tasks. However, these sub-networks are constructed relying on weight magnitude, which does not necessarily correspond to the importance of weights, resulting in maintaining unimportant weights and constructing redundant sub-networks. To overcome this limitation, inspired by information bottleneck, which removes redundancy between adjacent network layers, we propose Information Bottleneck Masked sub-network (IBM) to eliminate redundancy within sub-networks. Specifically, IBM accumulates valuable information into essential weights to construct redundancy-free sub-networks, not only effectively mitigating CF by freezing the sub-networks but also facilitating new tasks training through the transfer of valuable knowledge. Additionally, IBM decomposes hidden representations to automate the construction process and make it flexible. Extensive experiments demonstrate that IBM consistently outperforms state-of-the-art methods. Notably, IBM surpasses the state-of-the-art parameter isolation method with a 70% reduction in the number of parameters within sub-networks and an 80% decrease in training time.

arXiv information

Authors	Cheng Chen, Jingkuan Song, LianLi Gao, Heng Tao Shen
Published	2024-01-11 14:44:13+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
