Set-based Neural Network Encoding Without Weight Tying

Abstract

We propose a neural network weight encoding method for network property prediction that utilizes set-to-set and set-to-vector functions to efficiently encode neural network parameters. Our approach is capable of encoding neural networks in a model zoo of mixed architecture and different parameter sizes as opposed to previous approaches that require custom encoding models for different architectures. Furthermore, our Set-based Neural network Encoder (SNE) takes into consideration the hierarchical computational structure of neural networks. To respect symmetries inherent in network weight space, we utilize Logit Invariance to learn the required minimal invariance properties. Additionally, we introduce a pad-chunk-encode pipeline to efficiently encode neural network layers that is adjustable to computational and memory constraints. We also introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture. In cross-dataset property prediction, we evaluate how well property predictors generalize across model zoos trained on different datasets but of the same architecture. In cross-architecture property prediction, we evaluate how well property predictors transfer to model zoos of different architecture not seen during training. We show that SNE outperforms the relevant baselines on standard benchmarks.
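
The abstract names a pad-chunk-encode pipeline and set-to-vector functions but does not spell out their details. The following is only a rough illustrative sketch of the general idea, not the authors' implementation: a layer's weights are flattened, zero-padded to a multiple of a chunk size, split into chunks, and the set of chunks is pooled into one vector. The function names `pad_and_chunk` and `encode_chunks`, the random `projection` matrix, the `tanh` embedding, and the mean pooling are all assumptions made for illustration.

```python
import numpy as np


def pad_and_chunk(weights, chunk_size):
    """Flatten weights, zero-pad to a multiple of chunk_size, split into chunks.

    Hypothetical helper: the paper's actual padding scheme may differ.
    """
    flat = np.asarray(weights, dtype=np.float64).ravel()
    pad = (-flat.size) % chunk_size          # zeros needed to fill the last chunk
    padded = np.pad(flat, (0, pad))          # pad only at the end
    return padded.reshape(-1, chunk_size)    # shape: (num_chunks, chunk_size)


def encode_chunks(chunks, projection):
    """Assumed set-to-vector step: embed each chunk, then mean-pool over chunks.

    Mean pooling ignores chunk order, so the result is permutation-invariant.
    """
    embedded = np.tanh(chunks @ projection)  # shape: (num_chunks, embed_dim)
    return embedded.mean(axis=0)             # shape: (embed_dim,)


rng = np.random.default_rng(0)
layer = rng.normal(size=(5, 7))              # toy layer with 35 weights
chunks = pad_and_chunk(layer, chunk_size=8)  # 35 weights padded to 40, 5 chunks
projection = rng.normal(size=(8, 4))         # stand-in for a learned embedding
code = encode_chunks(chunks, projection)
```

A smaller `chunk_size` lowers the per-chunk cost at the price of more chunks, which is one way such a pipeline could be tuned to memory constraints. A plain mean pool discards chunk order entirely, so a real encoder that needs to respect layer structure would have to add position information, which this sketch omits.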

arXiv information

Authors	Bruno Andreis, Soro Bedionita, Philip H. S. Torr, Sung Ju Hwang
Published	2025-01-14 13:48:49+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
