Learning to Learn with Generative Models of Neural Network Checkpoints

Summary

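We explore a data-driven approach to learning to optimize neural networks.
We build a dataset of neural network checkpoints and train a generative model on their parameters.
Specifically, our model is a conditional diffusion transformer: given an initial input parameter vector and a prompted loss, error, or return, it predicts the distribution over parameter updates that achieve the desired metric.
At test time, it can optimize neural networks with unseen parameters for downstream tasks in a single update.
We find that the approach successfully generates parameters for a wide range of loss prompts.
It can also sample multimodal parameter solutions and has favorable scaling properties.
We apply the method to a variety of neural network architectures and tasks in supervised and reinforcement learning.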

Summary (original)

We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters for downstream tasks in just one update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
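The checkpoint-dataset idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the quadratic "network", the helper names `make_checkpoint_run` and `sample_training_pair`, and the choice to pair two checkpoints from the same run. It only shows how saved (parameters, loss) checkpoints might be turned into conditioning inputs (starting parameters, starting loss, prompted loss) and target parameters for a generative model; the diffusion transformer itself is not implemented.

```python
import random

def make_checkpoint_run(num_steps=20, dim=4, lr=0.1, seed=0):
    """Train a toy quadratic model by gradient descent, saving (params, loss) each step.

    Hypothetical stand-in for a real training run that writes checkpoints.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(dim)]
    theta = [rng.gauss(0, 1) for _ in range(dim)]
    run = []
    for _ in range(num_steps):
        # Loss is the squared distance between the parameters and the optimum.
        loss = sum((t - g) ** 2 for t, g in zip(theta, target))
        run.append((list(theta), loss))
        # Gradient of the squared distance is 2 * (theta - target).
        theta = [t - lr * 2.0 * (t - g) for t, g in zip(theta, target)]
    return run

def sample_training_pair(run, rng):
    """Draw two checkpoints from one run and build a (condition, target) pair.

    condition = starting parameters + [starting loss, prompted loss]
    target    = parameters of the checkpoint that achieved the prompted loss
    """
    theta_i, loss_i = rng.choice(run)
    theta_j, loss_j = rng.choice(run)
    condition = theta_i + [loss_i, loss_j]
    return condition, theta_j

run = make_checkpoint_run()
condition, target_params = sample_training_pair(run, random.Random(1))
```

A generative model trained on many such pairs would learn to map a starting checkpoint and a requested loss to parameters that reach it, which is the role the abstract assigns to the conditional diffusion transformer.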

arXiv information

Authors	William Peebles, Ilija Radosavovic, Tim Brooks, Alexei A. Efros, Jitendra Malik
Published	2022-09-26 17:59:58+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
