Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition

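Abstract

Exploiting sparsity in deep neural networks (DNNs) is a promising way to meet the growing computational demands of modern DNNs.
In practice, however, sparse DNN acceleration still faces a key challenge.
To minimize the overhead of sparse acceleration, hardware designers have recently proposed structured sparse hardware support, but it offers limited flexibility and requires additional model fine-tuning.
Moreover, a sparse model fine-tuned for one kind of structured sparse hardware cannot be accelerated by other structured sparse hardware.
To bridge this gap between sparse DNN models and hardware, the paper proposes tensor approximation via structured decomposition (TASD), which uses the distributive property of linear algebra to turn any sparse tensor into a series of structured sparse tensors.
The authors then develop TASDER, a software framework that accelerates DNNs by searching for a high-quality, layer-wise structured decomposition of both weight and activation tensors, so that the DNNs can run on any system with structured sparse hardware support.
Evaluation results show that, building on prior structured sparse hardware baselines, the method accelerates off-the-shelf dense and sparse DNNs without fine-tuning and improves energy-delay product by up to 83%, and by 74% on average.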
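
The decomposition idea can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not say how terms are chosen, so the greedy largest-magnitude selection below, the N:M pattern format, and the names `nm_project` and `structured_decompose` are all assumptions made for illustration. It splits an unstructured sparse matrix A into structured terms A1, A2 and relies on the distributive property, (A1 + A2) B = A1 B + A2 B, so that each product involves only a structured sparse operand.

```python
import numpy as np


def nm_project(x: np.ndarray, n: int, m: int) -> np.ndarray:
    """Keep the n largest-magnitude entries in each group of m along the rows.

    Hypothetical helper: greedy magnitude selection is an assumption,
    not something stated in the abstract.
    """
    rows, cols = x.shape
    assert cols % m == 0, "column count must be a multiple of m"
    groups = x.reshape(rows, cols // m, m)
    # Positions of the n largest |values| inside every group of m.
    keep = np.argsort(-np.abs(groups), axis=2)[:, :, :n]
    mask = np.zeros(groups.shape, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=2)
    return np.where(mask, groups, 0.0).reshape(rows, cols)


def structured_decompose(a: np.ndarray, patterns):
    """Peel one N:M structured term off the residual for each pattern."""
    terms = []
    residual = a.copy()
    for n, m in patterns:
        term = nm_project(residual, n, m)
        terms.append(term)
        residual = residual - term
    return terms, residual


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 16))
a[rng.random(a.shape) < 0.6] = 0.0  # unstructured sparsity, roughly 60% zeros
b = rng.standard_normal((16, 4))

# Approximate A with a 2:4 term plus a 1:4 term.
terms, residual = structured_decompose(a, [(2, 4), (1, 4)])

# Distributive property: (A1 + A2) @ B == A1 @ B + A2 @ B,
# so every matmul sees only a structured sparse left operand.
approx = sum(t @ b for t in terms)
rel_err = np.linalg.norm(a @ b - approx) / np.linalg.norm(a @ b)
print(f"relative error of the structured approximation: {rel_err:.4f}")
```

Whatever is left in `residual` is the approximation error. In this sketch, adding more terms (for example two 2:4 terms over groups of four) drives the residual to zero at the cost of more structured products.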

Abstract (original)

Exploiting sparsity in deep neural networks (DNNs) has been a promising area to meet the growing computation need of modern DNNs. However, in practice, sparse DNN acceleration still faces a key challenge. To minimize the overhead of sparse acceleration, hardware designers have proposed structured sparse hardware support recently, which provides limited flexibility and requires extra model fine-tuning. Moreover, any sparse model fine-tuned for certain structured sparse hardware cannot be accelerated by other structured hardware. To bridge the gap between sparse DNN models and hardware, this paper proposes tensor approximation via structured decomposition (TASD), which leverages the distributive property in linear algebra to turn any sparse tensor into a series of structured sparse tensors. Next, we develop a software framework, TASDER, to accelerate DNNs by searching layer-wise, high-quality structured decomposition for both weight and activation tensors so that they can be accelerated by any systems with structured sparse hardware support. Evaluation results show that, by exploiting prior structured sparse hardware baselines, our method can accelerate off-the-shelf dense and sparse DNNs without fine-tuning and improves energy-delay-product by up to 83% and 74% on average.

arXiv information

Authors	Geonhwa Jeong, Po-An Tsai, Abhimanyu R. Bambhaniya, Stephen W. Keckler, Tushar Krishna
Published	2024-03-31 23:47:47+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
