Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning

Abstract

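Structured pruning and quantization are promising approaches for reducing the inference time and memory footprint of neural networks.
However, most existing methods need the original training dataset to fine-tune the model.
This consumes substantial resources, and it is not feasible for applications that involve sensitive or proprietary data because of privacy and security concerns.
Several data-free methods have been proposed to address this problem, but they perform data-free pruning and quantization separately and therefore do not exploit the complementarity between the two.
In this paper, we propose a new framework called Unified Data-Free Compression (UDFC), which performs pruning and quantization simultaneously without any data or fine-tuning.
Specifically, UDFC starts from the assumption that part of the information in a damaged (e.g., pruned or quantized) channel can be preserved by a linear combination of the other channels, and from this assumption it derives a reconstruction form that restores the information lost to compression.
Finally, we formulate the reconstruction error between the original network and its compressed counterpart and theoretically derive a closed-form solution.
We evaluate UDFC on large-scale image classification and obtain significant improvements across a range of network architectures and compression methods.
For example, on ImageNet with ResNet-34, a 30% pruning ratio, and 6-bit quantization, we achieve a 20.54% accuracy improvement over the SOTA method.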
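
The linear-combination assumption can be illustrated with a small NumPy sketch. This is not the authors' algorithm: it assumes two stacked linear layers with no activation or normalization in between, a single pruned channel, and a ridge-regularized least-squares fit. The function name `compensate_pruned_channel` and the `ridge` parameter are hypothetical and chosen only for illustration. The point is that the compensation is computed from weights alone, without any input data, and has a closed form.

```python
import numpy as np


def compensate_pruned_channel(w1, w2, pruned, ridge=1e-8):
    """Prune one output channel of w1 and fold its contribution into w2.

    w1: (channels, in_features) weights of the layer being pruned.
    w2: (out_features, channels) weights of the following layer.
    Assumption: no nonlinearity between the two layers.
    """
    keep = [c for c in range(w1.shape[0]) if c != pruned]
    kept_rows = w1[keep]          # (channels - 1, in_features)
    target = w1[pruned]           # (in_features,)

    # Closed-form ridge least squares:
    #   argmin_s ||target - s @ kept_rows||^2 + ridge * ||s||^2
    gram = kept_rows @ kept_rows.T + ridge * np.eye(len(keep))
    scales = np.linalg.solve(gram, kept_rows @ target)

    # Each kept channel absorbs its share of the pruned channel's outgoing weights.
    w2_new = w2[:, keep] + np.outer(w2[:, pruned], scales)
    return kept_rows, w2_new, scales


rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 6))
w1[3] = 0.5 * w1[0] - 2.0 * w1[1]   # channel 3 is redundant by construction
w2 = rng.normal(size=(3, 4))

w1_small, w2_small, scales = compensate_pruned_channel(w1, w2, pruned=3)

x = rng.normal(size=6)               # used only to check the result, not to fit it
reference = w2 @ (w1 @ x)
error = np.max(np.abs(reference - w2_small @ (w1_small @ x)))
naive_error = np.max(np.abs(reference - w2[:, :3] @ (w1[:3] @ x)))
```

Here `error` measures the output gap after compensation and `naive_error` the gap when the channel is simply dropped. In this toy setup the pruned channel is an exact combination of two others, so compensation recovers the output almost exactly. In a real network the fit is only approximate, and activations and normalization layers need separate handling.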

Abstract (original)

Structured pruning and quantization are promising approaches for reducing the inference time and memory footprint of neural networks. However, most existing methods require the original training dataset to fine-tune the model. This not only brings heavy resource consumption but also is not possible for applications with sensitive or proprietary data due to privacy and security concerns. Therefore, a few data-free methods are proposed to address this problem, but they perform data-free pruning and quantization separately, which does not explore the complementarity of pruning and quantization. In this paper, we propose a novel framework named Unified Data-Free Compression (UDFC), which performs pruning and quantization simultaneously without any data and fine-tuning process. Specifically, UDFC starts with the assumption that the partial information of a damaged (e.g., pruned or quantized) channel can be preserved by a linear combination of other channels, and then derives the reconstruction form from the assumption to restore the information loss due to compression. Finally, we formulate the reconstruction error between the original network and its compressed network, and theoretically deduce the closed-form solution. We evaluate the UDFC on the large-scale image classification task and obtain significant improvements over various network architectures and compression methods. For example, we achieve a 20.54% accuracy improvement on ImageNet dataset compared to SOTA method with 30% pruning ratio and 6-bit quantization on ResNet-34.
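
The abstract treats quantization as another form of channel damage. As a rough illustration only, the sketch below quantizes one channel's weights on a symmetric uniform 6-bit grid and then applies a closed-form scalar rescale that minimizes the squared reconstruction error. Both the quantizer and the scalar rescale are assumptions made for this example and are much simpler than the reconstruction the paper derives. The names `quantize_symmetric` and `optimal_rescale` are hypothetical.

```python
import numpy as np


def quantize_symmetric(w, bits):
    """Round w onto a symmetric uniform grid with 2**(bits - 1) - 1 positive levels.

    Assumes w contains at least one nonzero value.
    """
    levels = 2 ** (bits - 1) - 1
    step = np.max(np.abs(w)) / levels
    return np.round(w / step) * step


def optimal_rescale(w, w_q):
    """Closed-form scalar a minimising ||w - a * w_q||^2, i.e. <w, w_q> / <w_q, w_q>."""
    return float(w @ w_q) / float(w_q @ w_q)


rng = np.random.default_rng(0)
w = rng.normal(size=64)                  # stand-in for one channel's weights
w_q = quantize_symmetric(w, bits=6)
a = optimal_rescale(w, w_q)

err_plain = np.linalg.norm(w - w_q)      # error of plain quantization
err_rescaled = np.linalg.norm(w - a * w_q)  # error after the closed-form rescale
```

Because `a` is the least-squares optimum, `err_rescaled` can never exceed `err_plain`. The gain from a single scalar is usually small, which is one reason to look for richer reconstructions that draw on other channels.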

arXiv information

Authors	Shipeng Bai, Jun Chen, Xintian Shen, Yixuan Qian, Yong Liu
Published	2023-08-14 15:25:07+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
