Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

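Summary

In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications such as sparse neural networks.
However, each of these proposals is a single point in a large and diverse design space.
The lack of systematic description and modeling support for these sparse tensor accelerators keeps hardware designers from exploring the design space efficiently and effectively.
This paper first presents a unified taxonomy for systematically describing the diverse design space of sparse tensor accelerators.
Based on that taxonomy, it introduces Sparseloop, the first fast, accurate, and flexible analytical modeling framework for early-stage evaluation and exploration of sparse tensor accelerators.
Sparseloop understands a large set of architecture specifications, including various dataflows and sparse acceleration features (e.g., elimination of zero-based compute).
Using these specifications, Sparseloop evaluates a design's processing speed and energy efficiency. It accounts for the data movement and compute incurred by the employed dataflow, and it uses stochastic tensor density models to account for the savings and overhead introduced by the sparse acceleration features.
Across representative accelerators and workloads, Sparseloop models designs over 2000 times faster than cycle-level simulation, preserves relative performance trends, and achieves 0.1% to 8% average error.
A case study demonstrates how Sparseloop helps reveal important insights for designing sparse tensor accelerators (e.g., that it is important to co-design orthogonal design aspects).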

Abstract (original)

In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications (e.g., sparse neural networks). However, these proposals are single points in a large and diverse design space. The lack of systematic description and modeling support for these sparse tensor accelerators impedes hardware designers from efficient and effective design space exploration. This paper first presents a unified taxonomy to systematically describe the diverse sparse tensor accelerator design space. Based on the proposed taxonomy, it then introduces Sparseloop, the first fast, accurate, and flexible analytical modeling framework to enable early-stage evaluation and exploration of sparse tensor accelerators. Sparseloop comprehends a large set of architecture specifications, including various dataflows and sparse acceleration features (e.g., elimination of zero-based compute). Using these specifications, Sparseloop evaluates a design’s processing speed and energy efficiency while accounting for data movement and compute incurred by the employed dataflow as well as the savings and overhead introduced by the sparse acceleration features using stochastic tensor density models. Across representative accelerators and workloads, Sparseloop achieves over 2000 times faster modeling speed than cycle-level simulations, maintains relative performance trends, and achieves 0.1% to 8% average error. With a case study, we demonstrate Sparseloop’s ability to help reveal important insights for designing sparse tensor accelerators (e.g., it is important to co-design orthogonal design aspects).
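To make the idea of a stochastic tensor density model concrete, the following is a minimal sketch and not Sparseloop's actual API or model. It assumes that nonzeros in each operand tensor are placed independently and uniformly at random, so the chance that a multiply-accumulate (MAC) sees two nonzero operands is the product of the two densities. The function names `expected_effectual_macs` and `prob_tile_empty` are hypothetical and chosen only for illustration.

```python
def expected_effectual_macs(total_macs: int, density_a: float, density_b: float) -> float:
    """Expected number of MACs whose two operands are both nonzero.

    Assumption (illustrative only): nonzeros in A and B are independent
    and uniformly distributed, so P(both nonzero) = density_a * density_b.
    """
    if not (0.0 <= density_a <= 1.0 and 0.0 <= density_b <= 1.0):
        raise ValueError("densities must lie in [0, 1]")
    return total_macs * density_a * density_b


def prob_tile_empty(tile_size: int, density: float) -> float:
    """Probability that a tile of `tile_size` elements holds no nonzeros,
    under the same independent (Bernoulli) assumption."""
    if tile_size < 0 or not (0.0 <= density <= 1.0):
        raise ValueError("tile_size must be >= 0 and density in [0, 1]")
    return (1.0 - density) ** tile_size


if __name__ == "__main__":
    # Hypothetical 64 x 64 x 64 matrix multiplication: 64**3 dense MACs.
    dense_macs = 64 ** 3
    print(expected_effectual_macs(dense_macs, 0.5, 0.25))  # 32768.0
    print(prob_tile_empty(4, 0.5))                         # 0.0625
```

Under these assumptions, an accelerator that skips every MAC with a zero operand would perform one eighth of the dense work in this example. An analytical estimate like this needs only the densities, not the actual tensor contents, which is one reason such a model can run much faster than a cycle-level simulation.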

arXiv information

Authors	Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S. Emer
Published	2022-09-15 17:47:13+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
