Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery

Summary

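The substantial success of Vision Transformers (ViTs) in computer vision tasks is largely attributable to their architecture design.
This underscores the need for efficient architecture search that designs better ViTs automatically.
Because training-based architecture search methods are computationally expensive, interest is growing in training-free methods that score ViTs with zero-cost proxies.
However, existing training-free approaches require expert knowledge to hand-design specific zero-cost proxies.
Moreover, these zero-cost proxies generalize poorly across diverse domains.
To address this problem, this paper introduces Auto-Prox, an automatic proxy discovery framework.
First, we build ViT-Bench-101, which contains a variety of ViT candidates together with their actual performance on multiple datasets.
With ViT-Bench-101, zero-cost proxies can be evaluated by their score-accuracy correlation.
Next, we represent zero-cost proxies as computation graphs and organize the zero-cost proxy search space from ViT statistics and primitive operations.
To discover generic zero-cost proxies, we propose a joint correlation metric that guides the evolution and mutation of zero-cost proxy candidates.
To achieve a better trade-off between exploitation and exploration, we introduce an elitism-preserving strategy that improves search efficiency.
Based on the discovered zero-cost proxy, we perform ViT architecture search in a training-free manner.
Extensive experiments demonstrate that our method generalizes well to different datasets and achieves state-of-the-art results in both ranking correlation and final accuracy.
The code is available at https://github.com/lilujunai/Auto-Prox-AAAI24.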
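
To make the idea of score-accuracy correlation concrete, here is a minimal sketch. It is not the authors' implementation. It assumes Kendall's tau as the ranking correlation, and it assumes that a joint metric can be approximated by the mean tau over datasets; the summary specifies neither. The function names `kendall_tau` and `joint_correlation` and the dictionary layout are hypothetical.

```python
from itertools import combinations
from statistics import mean


def kendall_tau(scores, accuracies):
    """Kendall's tau-a between proxy scores and measured accuracies.

    Counts architecture pairs that the proxy orders the same way as the
    ground-truth accuracy (concordant) versus the opposite way (discordant).
    Tied pairs count as neither.
    """
    if len(scores) != len(accuracies) or len(scores) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(scores)), 2):
        sign = (scores[i] - scores[j]) * (accuracies[i] - accuracies[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(scores) * (len(scores) - 1) // 2
    return (concordant - discordant) / n_pairs


def joint_correlation(scores_by_dataset, accuracies_by_dataset):
    """Assumed joint metric: mean Kendall's tau over all datasets.

    Both arguments map a dataset name to a list with one entry per
    architecture, in the same architecture order.
    """
    return mean(
        kendall_tau(scores_by_dataset[name], accuracies)
        for name, accuracies in accuracies_by_dataset.items()
    )
```

A proxy that ranks architectures perfectly on every dataset would score 1.0 under this sketch, and one that ranks them in exactly reversed order would score -1.0. Averaging over datasets is one simple way to reward proxies that generalize rather than fit a single benchmark.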

Summary (original)

The substantial success of Vision Transformer (ViT) in computer vision tasks is largely attributed to the architecture design. This underscores the necessity of efficient architecture search for designing better ViTs automatically. As training-based architecture search methods are computationally intensive, there is a growing interest in training-free methods that use zero-cost proxies to score ViTs. However, existing training-free approaches require expert knowledge to manually design specific zero-cost proxies. Moreover, these zero-cost proxies exhibit limitations to generalize across diverse domains. In this paper, we introduce Auto-Prox, an automatic proxy discovery framework, to address the problem. First, we build the ViT-Bench-101, which involves different ViT candidates and their actual performance on multiple datasets. Utilizing ViT-Bench-101, we can evaluate zero-cost proxies based on their score-accuracy correlation. Then, we represent zero-cost proxies with computation graphs and organize the zero-cost proxy search space with ViT statistics and primitive operations. To discover generic zero-cost proxies, we propose a joint correlation metric to evolve and mutate different zero-cost proxy candidates. We introduce an elitism-preserve strategy for search efficiency to achieve a better trade-off between exploitation and exploration. Based on the discovered zero-cost proxy, we conduct a ViT architecture search in a training-free manner. Extensive experiments demonstrate that our method generalizes well to different datasets and achieves state-of-the-art results both in ranking correlation and final accuracy. Codes can be found at https://github.com/lilujunai/Auto-Prox-AAAI24.
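
The evolutionary proxy search with elitism described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the Auto-Prox algorithm: a proxy candidate is simplified to a tuple of operation names instead of a computation graph, the operation set `PRIMITIVE_OPS` is hypothetical, and `toy_fitness` merely stands in for a correlation measured on a benchmark.

```python
import random

# Hypothetical primitive operations; the actual search space is not given here.
PRIMITIVE_OPS = ("log", "abs", "square", "sqrt", "neg")


def mutate_ops(candidate, rng):
    """Return a copy of the candidate with one randomly chosen op replaced."""
    idx = rng.randrange(len(candidate))
    return candidate[:idx] + (rng.choice(PRIMITIVE_OPS),) + candidate[idx + 1:]


def toy_fitness(candidate, target=("abs", "square", "log")):
    """Stand-in fitness: number of positions matching an arbitrary target.

    In a real search this would be a score-accuracy correlation.
    """
    return sum(op == wanted for op, wanted in zip(candidate, target))


def evolve(init_population, fitness, mutate, generations=20, n_elite=2, seed=0):
    """Evolutionary search that copies the top n_elite candidates unchanged."""
    if not 1 <= n_elite <= len(init_population):
        raise ValueError("n_elite must be between 1 and the population size")
    rng = random.Random(seed)
    population = list(init_population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elites = ranked[:n_elite]  # exploitation: best candidates survive as-is
        parents = ranked[: max(n_elite, len(ranked) // 2)]
        children = [  # exploration: mutated copies of the better half
            mutate(rng.choice(parents), rng)
            for _ in range(len(population) - n_elite)
        ]
        population = elites + children
    return max(population, key=fitness)


if __name__ == "__main__":
    start = [(op, op, op) for op in PRIMITIVE_OPS[:4]]
    best = evolve(start, toy_fitness, mutate_ops, generations=30, n_elite=1)
    print(best, toy_fitness(best))
```

Because the elites are carried over unchanged, the best fitness in the population can never decrease from one generation to the next, while the mutated children keep exploring new candidates.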

arXiv information

Authors	Zimian Wei, Lujun Li, Peijie Dong, Zheng Hui, Anggeng Li, Menglong Lu, Hengyue Pan, Zhiliang Tian, Dongsheng Li
Published	2023-12-14 15:55:53+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
