Fast as CHITA: Neural Network Pruning with Combinatorial Optimization

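Summary

Modern neural networks are so large that serving them poses a serious computational challenge.
A popular class of compression techniques addresses this challenge by pruning, or sparsifying, the weights of pretrained networks.
These techniques are useful, but they often face a serious tradeoff between computational requirements and compression quality.
This work proposes a new optimization-based pruning framework that accounts for the combined effect of pruning (and updating) multiple weights subject to a sparsity constraint.
The approach, CHITA, extends the classical Optimal Brain Surgeon framework and significantly improves speed, memory use, and performance over existing optimization-based approaches to network pruning.
CHITA's main workhorse performs combinatorial optimization updates on a memory-friendly representation of local quadratic approximation(s) of the loss function.
On a standard benchmark of pretrained models and datasets, CHITA yields significantly better sparsity-accuracy tradeoffs than competing methods.
For example, for MLPNet with only 2% of the weights retained, the approach improves accuracy by 63% relative to the state of the art.
Furthermore, when used in conjunction with fine-tuning SGD steps, the method achieves significant accuracy gains over state-of-the-art approaches.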

Abstract (original)

The sheer size of modern neural networks makes model serving a serious computational challenge. A popular class of compression techniques overcomes this challenge by pruning or sparsifying the weights of pretrained networks. While useful, these techniques often face serious tradeoffs between computational requirements and compression quality. In this work, we propose a novel optimization-based pruning framework that considers the combined effect of pruning (and updating) multiple weights subject to a sparsity constraint. Our approach, CHITA, extends the classical Optimal Brain Surgeon framework and results in significant improvements in speed, memory, and performance over existing optimization-based approaches for network pruning. CHITA’s main workhorse performs combinatorial optimization updates on a memory-friendly representation of local quadratic approximation(s) of the loss function. On a standard benchmark of pretrained models and datasets, CHITA leads to significantly better sparsity-accuracy tradeoffs than competing methods. For example, for MLPNet with only 2% of the weights retained, our approach improves the accuracy by 63% relative to the state of the art. Furthermore, when used in conjunction with fine-tuning SGD steps, our method achieves significant accuracy gains over the state-of-the-art approaches.
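The abstract does not spell out CHITA's algorithm, so the following is only a minimal sketch of the kind of problem it describes: minimizing a local quadratic approximation of the loss subject to a sparsity constraint. It uses plain iterative hard thresholding (IHT), a generic technique swapped in for illustration, not the authors' method. All names (`hard_threshold`, `prune_quadratic_iht`, `quadratic_loss`), the dense Hessian `H`, the zero gradient `g`, and the random data are assumptions made for this example.

```python
import numpy as np


def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and set the rest to zero."""
    out = np.zeros_like(w)
    if k > 0:
        keep = np.argpartition(np.abs(w), -k)[-k:]
        out[keep] = w[keep]
    return out


def quadratic_loss(H, g, w0, w):
    """Local quadratic model q(w) = g.(w - w0) + 0.5 (w - w0)^T H (w - w0)."""
    d = w - w0
    return g @ d + 0.5 * d @ H @ d


def prune_quadratic_iht(H, g, w0, k, n_iter=200):
    """Approximately minimize q(w) subject to at most k nonzeros, using IHT.

    Assumes H is symmetric positive semidefinite with a positive largest
    eigenvalue. This is a generic sketch, not the CHITA algorithm.
    """
    step = 1.0 / np.linalg.eigvalsh(H)[-1]  # 1/L, L = largest eigenvalue of H
    w = hard_threshold(w0, k)  # start from magnitude pruning
    for _ in range(n_iter):
        grad = g + H @ (w - w0)  # gradient of the quadratic model at w
        w = hard_threshold(w - step * grad, k)  # gradient step, then project
    return w


# Hypothetical toy problem: 20 weights, keep 5 of them.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
H = A.T @ A / 50 + 1e-3 * np.eye(20)  # stand-in for a Hessian approximation
w0 = rng.standard_normal(20)  # stand-in for pretrained weights
g = np.zeros(20)  # assume the pretrained weights are a stationary point

w_mag = hard_threshold(w0, 5)  # magnitude pruning baseline
w_iht = prune_quadratic_iht(H, g, w0, 5)  # pruning that also updates weights
```

Because the step size is the reciprocal of the largest eigenvalue of `H`, each IHT iteration cannot increase the quadratic model, so `w_iht` is never worse than the magnitude-pruned starting point under that model. Unlike magnitude pruning, the surviving weights are also updated, which mirrors the "pruning (and updating)" idea in the abstract.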

arXiv information

Authors	Riade Benbaki,Wenyu Chen,Xiang Meng,Hussein Hazimeh,Natalia Ponomareva,Zhe Zhao,Rahul Mazumder
Published	2023-02-28 15:03:18+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
