CrAM: A Compression-Aware Minimizer

Summary

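Title: CrAM: A Compression-Aware Minimizer
Summary:
– Deep neural networks (DNNs) often have to be compressed via pruning and/or quantization before they can be deployed in practical settings.
– CrAM is a new compression-aware minimizer that modifies the optimization step in a principled way, producing models whose local loss behavior is stable under compression operations such as pruning.
– Dense models trained with CrAM should therefore be compressible after training, in a single step, without significant accuracy loss.
– Experiments on standard benchmarks, such as residual networks for ImageNet classification and BERT models for language modelling, show that CrAM produces dense models that can be more accurate than standard SGD/Adam baselines while remaining stable under weight pruning. Specifically, models can be pruned in one shot to 70-80% sparsity with almost no accuracy loss, and to 90% sparsity with about 1% accuracy loss, which is competitive with gradual compression methods.
– CrAM can also produce sparse models that perform well in transfer learning, and it works for the semi-structured 2:4 pruning patterns supported by GPU hardware.
– Code for reproducing the results is available at https://github.com/IST-DASLab/CrAM.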
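
To make "one-shot pruning to 70-80% sparsity" concrete, here is a minimal sketch in plain Python. It assumes weight magnitude as the pruning criterion, which this summary does not specify, and the function name `one_shot_magnitude_prune` and the toy weights are hypothetical. It illustrates the compression step only, not CrAM training and not the authors' code.

```python
def one_shot_magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with the smallest magnitude.

    Assumption: magnitude is used as the importance score.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = round(len(weights) * sparsity)
    # Indices ordered by ascending magnitude; the first num_to_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:num_to_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Toy example: prune 10 hypothetical weights to 70% sparsity in a single step.
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.15, 0.6, 0.02, -0.4]
pruned = one_shot_magnitude_prune(weights, 0.7)
print(pruned)
```

"One shot" here means the mask is computed and applied once, with no retraining between pruning steps, in contrast to gradual methods that alternate pruning and fine-tuning.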

Summary (original)

Deep neural networks (DNNs) often have to be compressed, via pruning and/or quantization, before they can be deployed in practical settings. In this work we propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way, in order to produce models whose local loss behavior is stable under compression operations such as pruning. Thus, dense models trained via CrAM should be compressible post-training, in a single step, without significant accuracy loss. Experimental results on standard benchmarks, such as residual networks for ImageNet classification and BERT models for language modelling, show that CrAM produces dense models that can be more accurate than the standard SGD/Adam-based baselines, but which are stable under weight pruning: specifically, we can prune models in one-shot to 70-80% sparsity with almost no accuracy loss, and to 90% with reasonable ($\sim 1\%$) accuracy loss, which is competitive with gradual compression methods. Additionally, CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware. The code for reproducing the results is available at https://github.com/IST-DASLab/CrAM .
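
The semi-structured 2:4 pattern mentioned in the abstract keeps at most two nonzero values in every group of four consecutive weights. The sketch below shows one common way to build such a pattern, assuming the two largest-magnitude entries in each group are the ones kept. The selection rule, the function name `prune_2_4`, and the sample row are illustrative assumptions and are not taken from the CrAM repository.

```python
def prune_2_4(weights):
    """Apply a 2:4 pattern: keep 2 of every 4 consecutive weights.

    Assumption: within each group, the two largest-magnitude entries are kept.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    out = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        out.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return out


# Hypothetical row of 8 weights, i.e. two groups of four.
row = [0.5, -0.1, 0.2, 0.9, -0.3, 0.05, 0.0, 0.4]
row_2_4 = prune_2_4(row)
print(row_2_4)
```

Every group ends up exactly 50% sparse, so the overall sparsity is fixed by the pattern instead of being a tunable target as in unstructured pruning.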

arXiv information

Authors	Alexandra Peste, Adrian Vladu, Eldar Kurtic, Christoph H. Lampert, Dan Alistarh
Published	2023-05-04 13:55:21+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
