Model agnostic methods meta-learn despite misspecifications

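Abstract

Meta-learning has recently attracted considerable interest owing to its empirical success in few-shot classification and reinforcement learning.
It leverages data from previous tasks to learn a new task quickly, even when data for that task is limited.
In particular, model-agnostic methods look for an initialisation point from which gradient descent adapts quickly to any new task.
Such methods have been empirically suggested to learn a good shared representation during training, but there is no strong theoretical evidence for this behaviour.
More importantly, it is unclear whether these methods are truly model-agnostic, that is, whether they still learn a shared structure when the architecture is misspecified.
To fill this gap, this work shows that, in the limit of an infinite number of tasks, first-order ANIL with a linear two-layer network architecture successfully learns a linear shared representation.
Moreover, the result holds under misspecification: a network width larger than the hidden dimension of the shared representation does not harm the algorithm's performance.
The learned parameters then yield a small test loss after a single gradient step on any new task.
Overall, this illustrates how well model-agnostic methods can adapt to any (unknown) model structure.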

Abstract (original)

Due to its empirical success on few shot classification and reinforcement learning, meta-learning recently received a lot of interest. Meta-learning leverages data from previous tasks to quickly learn a new task, despite limited data. In particular, model agnostic methods look for initialisation points from which gradient descent quickly adapts to any new task. Although it has been empirically suggested that such methods learn a good shared representation during training, there is no strong theoretical evidence of such behavior. More importantly, it is unclear whether these methods truly are model agnostic, i.e., whether they still learn a shared structure despite architecture misspecifications. To fill this gap, this work shows in the limit of an infinite number of tasks that first order ANIL with a linear two-layer network architecture successfully learns a linear shared representation. Moreover, this result holds despite misspecifications: having a large width with respect to the hidden dimension of the shared representation does not harm the algorithm performance. The learnt parameters then allow to get a small test loss after a single gradient step on any new task. Overall this illustrates how well model agnostic methods can adapt to any (unknown) model structure.
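
To make the setting concrete, here is a minimal NumPy sketch of first-order ANIL with a linear two-layer model x -> x^T B w. It is not the authors' code. The data model (Gaussian inputs, noiseless labels, orthonormal ground-truth representation `B_star`), all function names, the step sizes, and the sample sizes are illustrative assumptions, and a finite batch of tasks per step stands in for the infinite-task limit analysed in the paper. The inner loop adapts only the head `w`; the outer update evaluates gradients at the adapted head and ignores second-order terms.

```python
import numpy as np

def loss(B, w, X, y):
    """Halved mean squared error of the linear two-layer model x -> x^T B w."""
    r = X @ (B @ w) - y
    return 0.5 * np.mean(r ** 2)

def grads(B, w, X, y):
    """Gradients of `loss` with respect to the representation B and the head w."""
    r = X @ (B @ w) - y
    g = X.T @ r / len(y)              # gradient w.r.t. the product B @ w
    return np.outer(g, w), B.T @ g

def adapt_head(B, w, X, y, alpha):
    """ANIL inner loop: one gradient step on the head only; B stays fixed."""
    _, g_w = grads(B, w, X, y)
    return w - alpha * g_w

def sample_task(rng, B_star, n_in, n_out):
    """Assumed data model: y = x^T B_star w_star with Gaussian x and w_star."""
    d, k = B_star.shape
    w_star = rng.standard_normal(k)
    X_in = rng.standard_normal((n_in, d))
    X_out = rng.standard_normal((n_out, d))
    return X_in, X_in @ (B_star @ w_star), X_out, X_out @ (B_star @ w_star)

def fo_anil(B_star, width, alpha=0.1, beta=0.1, steps=1000, batch=10,
            n_in=20, n_out=20, seed=0):
    """First-order ANIL; `width` may exceed the true hidden dimension."""
    rng = np.random.default_rng(seed)
    d = B_star.shape[0]
    B = 0.1 * rng.standard_normal((d, width))   # small random initialisation
    w = np.zeros(width)
    for _ in range(steps):
        G_B, G_w = np.zeros_like(B), np.zeros_like(w)
        for _ in range(batch):
            X_in, y_in, X_out, y_out = sample_task(rng, B_star, n_in, n_out)
            w_i = adapt_head(B, w, X_in, y_in, alpha)
            # First-order approximation: evaluate the outer gradients at
            # (B, w_i) and treat w_i as a constant (no second-order terms).
            g_B, g_w = grads(B, w_i, X_out, y_out)
            G_B += g_B / batch
            G_w += g_w / batch
        B -= beta * G_B
        w -= beta * G_w
    return B, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, width = 10, 2, 5            # width > k: misspecified hidden dimension
    B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]
    B, w = fo_anil(B_star, width)
    X_in, y_in, X_out, y_out = sample_task(rng, B_star, 20, 1000)
    w_new = adapt_head(B, w, X_in, y_in, alpha=0.1)
    print("test loss before adaptation:", loss(B, w, X_out, y_out))
    print("test loss after one step:   ", loss(B, w_new, X_out, y_out))
```

Running the script prints the held-out loss on a fresh task before and after a single head update. Setting `width` above the true dimension `k` mirrors the misspecified case described in the abstract, though whether this small run matches the theoretical behaviour depends on the assumed step sizes and sample counts.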

arXiv information

Authors	Oguz Yuksel, Etienne Boursier, Nicolas Flammarion
Published	2023-03-02 15:13:37+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
