Continual Learning of Numerous Tasks from Long-tail Distributions

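Abstract

Continual learning, an important aspect of artificial intelligence and machine learning research, focuses on developing models that learn and adapt to new tasks while retaining previously acquired knowledge. Existing continual learning algorithms typically involve a small number of uniformly sized tasks and may not accurately represent real-world learning scenarios. This paper investigates the performance of continual learning algorithms on a large number of tasks drawn from a task distribution whose task sizes are long-tailed. To evaluate existing algorithms in this setting, the authors design one synthetic dataset and two real-world continual learning datasets. They also study an overlooked factor in continual learning, the optimizer states (for example, the first and second moments in the Adam optimizer), and examine how these states can be used to improve continual learning performance. They propose a method that reuses Adam's optimizer states by maintaining a weighted average of the second moments from previous tasks. The method is compatible with most existing continual learning algorithms. The authors demonstrate that it effectively reduces forgetting at only a small additional computational and memory cost, and that it further improves existing continual learning algorithms, particularly on long-tail task sequences.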

Abstract (original)

Continual learning, an important aspect of artificial intelligence and machine learning research, focuses on developing models that learn and adapt to new tasks while retaining previously acquired knowledge. Existing continual learning algorithms usually involve a small number of tasks with uniform sizes and may not accurately represent real-world learning scenarios. In this paper, we investigate the performance of continual learning algorithms with a large number of tasks drawn from a task distribution that is long-tail in terms of task sizes. We design one synthetic dataset and two real-world continual learning datasets to evaluate the performance of existing algorithms in such a setting. Moreover, we study an overlooked factor in continual learning, the optimizer states, e.g. first and second moments in the Adam optimizer, and investigate how it can be used to improve continual learning performance. We propose a method that reuses the optimizer states in Adam by maintaining a weighted average of the second moments from previous tasks. We demonstrate that our method, compatible with most existing continual learning algorithms, effectively reduces forgetting with only a small amount of additional computational or memory costs, and provides further improvements on existing continual learning algorithms, particularly in a long-tail task sequence.
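The abstract describes the proposed method only as "maintaining a weighted average of the second moments from previous tasks". The Python sketch below illustrates that bookkeeping step alone, and it is not the authors' implementation. Several things are assumptions made for illustration: the helper names `adam_step`, `merge_second_moments`, and `run_task`, the toy quadratic tasks, the choice to reset Adam's state at the start of each task, and the choice to weight each task by its number of training steps. How the averaged second moments are fed back into later updates is not stated in the abstract, so the sketch leaves that out.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update on plain Python lists; returns (params, m, v)."""
    new_p, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g          # first moment (mean of gradients)
        vi = b2 * vi + (1 - b2) * g * g      # second moment (mean of squared gradients)
        m_hat = mi / (1 - b1 ** t)           # bias correction
        v_hat = vi / (1 - b2 ** t)
        new_p.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_p, new_m, new_v


def merge_second_moments(v_avg, total_weight, v_task, task_weight):
    """Hypothetical helper: running weighted average of per-task second moments.

    task_weight must be positive; here it is assumed to be the task's step count.
    """
    new_total = total_weight + task_weight
    merged = [
        (total_weight * a + task_weight * b) / new_total
        for a, b in zip(v_avg, v_task)
    ]
    return merged, new_total


def run_task(params, target, steps):
    """Toy task (assumption): minimise sum((p - target)^2) with a fresh Adam state."""
    m = [0.0] * len(params)
    v = [0.0] * len(params)
    for t in range(1, steps + 1):
        grads = [2.0 * (p - c) for p, c in zip(params, target)]
        params, m, v = adam_step(params, grads, m, v, t)
    return params, v


# Two tasks of very different sizes, mimicking a long-tail task sequence.
params = [0.0, 0.0]
v_avg, total = [0.0, 0.0], 0.0
for target, steps in [([1.0, -1.0], 100), ([0.5, 2.0], 5)]:
    params, v_task = run_task(params, target, steps)
    v_avg, total = merge_second_moments(v_avg, total, v_task, steps)
```

With step-count weights, the large first task dominates the average and the small tail task shifts it only slightly. This is one plausible reading of "weighted", not a confirmed detail of the paper.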

arXiv information

Authors	Liwei Kang, Wee Sun Lee
Published	2024-04-03 13:56:33+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
