ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients

Summary

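Neural architecture search (NAS) is widely used to automatically find the best-performing neural network among a large number of candidate architectures.
To shorten search time, zero-shot NAS aims to design training-free proxies that predict the test performance of a given architecture.
However, as shown recently, none of the zero-shot proxies proposed so far consistently outperforms a naive proxy, namely the number of network parameters (#Params).
To address this, the paper's main theoretical contribution is to show how specific gradient properties across different samples affect the convergence rate and generalization capacity of neural networks.
Building on this theoretical analysis, the authors propose ZiCo, a new zero-shot proxy and the first to work consistently better than #Params.
ZiCo outperforms state-of-the-art (SOTA) proxies on several popular NAS benchmarks (NASBench101, NATSBench-SSS/TSS, TransNASBench-101) across multiple applications (e.g., image classification/reconstruction and pixel-level prediction).
Finally, the optimal architectures found via ZiCo are as competitive as those found by one-shot and multi-shot NAS methods, while requiring much less search time.
For example, on ImageNet, ZiCo-based NAS finds optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively, within 0.4 GPU days.
The code is available at https://github.com/SLDGroup/ZiCo.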

Abstract (original)

Neural Architecture Search (NAS) is widely used to automatically obtain the neural network with the best performance among a large number of candidate architectures. To reduce the search time, zero-shot NAS aims at designing training-free proxies that can predict the test performance of a given architecture. However, as shown recently, none of the zero-shot proxies proposed to date can actually work consistently better than a naive proxy, namely, the number of network parameters (#Params). To improve this state of affairs, as the main theoretical contribution, we first reveal how some specific gradient properties across different samples impact the convergence rate and generalization capacity of neural networks. Based on this theoretical analysis, we propose a new zero-shot proxy, ZiCo, the first proxy that works consistently better than #Params. We demonstrate that ZiCo works better than State-Of-The-Art (SOTA) proxies on several popular NAS-Benchmarks (NASBench101, NATSBench-SSS/TSS, TransNASBench-101) for multiple applications (e.g., image classification/reconstruction and pixel-level prediction). Finally, we demonstrate that the optimal architectures found via ZiCo are as competitive as the ones found by one-shot and multi-shot NAS methods, but with much less search time. For example, ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively, on ImageNet within 0.4 GPU days. Our code is available at https://github.com/SLDGroup/ZiCo.
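
The abstract names the proxy but does not spell out its formula. As a minimal illustration of an inverse coefficient of variation on gradients, the sketch below assumes that the score is built by taking, for each parameter, the mean of the absolute gradient across several mini-batches divided by its standard deviation, summing these ratios within a layer, and adding up the logarithms of the layer sums. The function name `inverse_cv_score`, the nested-list input layout, and the `eps` threshold are hypothetical choices for this example, and the gradient values are made up. For the authors' actual implementation, see the repository linked above.

```python
import math
from statistics import fmean, pstdev


def inverse_cv_score(per_layer_grads, eps=1e-12):
    """Illustrative inverse-coefficient-of-variation score (an assumption,
    not the authors' code).

    per_layer_grads: list of layers; each layer is a list of mini-batches;
    each mini-batch is a list with one gradient value per parameter.
    """
    score = 0.0
    for layer in per_layer_grads:
        layer_sum = 0.0
        # zip(*layer) regroups the values so that each tuple holds one
        # parameter's gradient across all mini-batches.
        for grads_across_batches in zip(*layer):
            abs_grads = [abs(g) for g in grads_across_batches]
            std = pstdev(abs_grads)
            # Skip parameters whose gradient does not vary, to avoid
            # dividing by zero.
            if std > eps:
                layer_sum += fmean(abs_grads) / std
        if layer_sum > 0.0:
            score += math.log(layer_sum)
    return score


# Two hypothetical single-layer networks, two mini-batches each.
steady = [[[1.9, 2.0], [2.1, 2.2]]]   # gradients agree across batches
erratic = [[[0.5, 0.1], [3.5, 4.1]]]  # gradients differ widely across batches

print(inverse_cv_score(steady), inverse_cv_score(erratic))
```

Under these assumptions, a network whose gradients have a large mean and a small spread across mini-batches receives a higher score, which matches the intuition in the abstract that gradient statistics across samples relate to convergence and generalization.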

arXiv information

Authors	Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, Radu Marculescu
Published	2023-03-01 16:13:33+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
