Graph Metanetworks for Processing Diverse Neural Architectures

Abstract

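Neural networks efficiently encode learned information in their parameters.
Many tasks can therefore be unified by treating neural networks themselves as input data.
Recent work has shown that, when doing so, it is important to account for the symmetries and geometry of parameter space.
However, those works developed architectures tailored to specific networks, such as MLPs and CNNs without normalization layers, and such architectures can be difficult to generalize to other types of networks.
This work overcomes these challenges by building new metanetworks, that is, neural networks that take the weights of other neural networks as input.
Put simply, the authors carefully construct a graph representing the input neural network and process that graph with a graph neural network.
The approach, Graph Metanetworks (GMNs), generalizes to neural architectures where competing methods struggle, such as multi-head attention layers, normalization layers, convolutional layers, ResNet blocks, and group-equivariant linear layers.
The authors prove that GMNs are expressive and equivariant to the parameter permutation symmetries that leave the input neural network's function unchanged.
They validate the method on several metanetwork tasks over diverse neural network architectures.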

Abstract (original)

Neural networks efficiently encode learned information within their parameters. Consequently, many tasks can be unified by treating neural networks themselves as input data. When doing so, recent studies demonstrated the importance of accounting for the symmetries and geometry of parameter spaces. However, those works developed architectures tailored to specific networks such as MLPs and CNNs without normalization layers, and generalizing such architectures to other types of networks can be challenging. In this work, we overcome these challenges by building new metanetworks – neural networks that take weights from other neural networks as input. Put simply, we carefully build graphs representing the input neural networks and process the graphs using graph neural networks. Our approach, Graph Metanetworks (GMNs), generalizes to neural architectures where competing methods struggle, such as multi-head attention layers, normalization layers, convolutional layers, ResNet blocks, and group-equivariant linear layers. We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network functions unchanged. We validate the effectiveness of our method on several metanetwork tasks over diverse neural network architectures.
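
The permutation symmetry mentioned in the abstract can be illustrated with a small, self-contained sketch. It assumes a two-layer tanh MLP and a simple graph encoding with one node per neuron and one edge per weight; the helper names (`mlp_forward`, `permute_hidden`, `mlp_to_graph`, `relabel_hidden`) are hypothetical, and the encoding is not the authors' construction, which the abstract does not specify. The sketch shows that reordering hidden neurons leaves the network's output unchanged, and that the permuted network maps to the same graph up to a relabeling of hidden nodes.

```python
import math
import random


def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer MLP with tanh hidden units; weights are nested lists."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden neurons: rows of W1, entries of b1, columns of W2."""
    W1p = [W1[j] for j in perm]
    b1p = [b1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, b1p, W2p


def mlp_to_graph(W1, W2):
    """Assumed encoding: one node per neuron, one weighted edge per weight.

    Returns (source, target, weight) triples; biases are omitted for brevity.
    """
    edges = []
    for j, row in enumerate(W1):
        for i, w in enumerate(row):
            edges.append((("in", i), ("hid", j), w))
    for k, row in enumerate(W2):
        for j, w in enumerate(row):
            edges.append((("hid", j), ("out", k), w))
    return edges


def relabel_hidden(edges, perm):
    """Map hidden node p of the permuted network back to original index perm[p]."""
    def fix(node):
        kind, idx = node
        return (kind, perm[idx]) if kind == "hid" else node
    return [(fix(s), fix(t), w) for s, t, w in edges]


rng = random.Random(0)
n_in, n_hid, n_out = 3, 4, 2
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hid)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]

perm = [2, 0, 3, 1]
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# The permuted parameters compute the same function (up to float rounding).
x = [0.5, -1.0, 2.0]
y = mlp_forward(x, W1, b1, W2, b2)
yp = mlp_forward(x, W1p, b1p, W2p, b2)
same_function = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(y, yp))

# The two parameter graphs coincide once hidden nodes are relabeled.
graphs_match = (sorted(relabel_hidden(mlp_to_graph(W1p, W2p), perm))
                == sorted(mlp_to_graph(W1, W2)))
```

Because the two parameter settings yield the same graph up to node relabeling, a graph neural network applied to this encoding treats them consistently, which is the intuition behind processing weights as graphs rather than as flat vectors.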

arXiv information

Authors	Derek Lim, Haggai Maron, Marc T. Law, Jonathan Lorraine, James Lucas
Published	2023-12-07 18:21:52+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
