It’s all about PR — Smart Benchmarking AI Accelerators using Performance Representatives

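Summary

Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators.
However, training a statistical performance model often requires a large amount of data, which takes considerable time to collect and can be difficult to obtain when hardware availability is limited.
To alleviate this problem, the authors propose a new performance modeling methodology that significantly reduces the number of training samples while maintaining good accuracy.
The approach uses knowledge of the target hardware architecture, together with initial parameter sweeps, to identify a set of Performance Representatives (PRs) for deep neural network (DNN) layers.
These PRs are then used for benchmarking, for building the statistical performance model, and for making estimations.
Compared with random sampling, this targeted approach drastically reduces the number of training samples needed and achieves better estimation accuracy.
With fewer than 10,000 training samples, it achieves a Mean Absolute Percentage Error (MAPE) as low as 0.02% for single-layer estimations and 0.68% for whole-DNN estimations.
For single-layer estimations, the results show that the method outperforms models trained on randomly sampled datasets of the same size.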

Summary (original)

Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training of statistical performance models often requires vast amounts of data, leading to a significant time investment and can be difficult in case of limited hardware availability. To alleviate this problem, we propose a novel performance modeling methodology that significantly reduces the number of training samples while maintaining good accuracy. Our approach leverages knowledge of the target hardware architecture and initial parameter sweeps to identify a set of Performance Representatives (PR) for deep neural network (DNN) layers. These PRs are then used for benchmarking, building a statistical performance model, and making estimations. This targeted approach drastically reduces the number of training samples needed, opposed to random sampling, to achieve a better estimation accuracy. We achieve a Mean Absolute Percentage Error (MAPE) of as low as 0.02% for single-layer estimations and 0.68% for whole DNN estimations with less than 10000 training samples. The results demonstrate the superiority of our method for single-layer estimations compared to models trained with randomly sampled datasets of the same size.
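The accuracy figures above are reported as MAPE. As a reminder of what that metric measures, here is a minimal sketch, assuming the standard definition (the mean of |measured - estimated| / |measured|, expressed in percent). The function name `mape` and the latency values are hypothetical illustrations and are not taken from the paper.

```python
from typing import Sequence


def mape(measured: Sequence[float], estimated: Sequence[float]) -> float:
    """Return the mean absolute percentage error, in percent.

    Assumes the standard definition; measured values must be non-zero.
    """
    if len(measured) != len(estimated):
        raise ValueError("measured and estimated must have the same length")
    if len(measured) == 0:
        raise ValueError("at least one sample is required")
    if any(m == 0 for m in measured):
        raise ValueError("measured values must be non-zero")

    # Relative error of each estimate with respect to its measurement.
    relative_errors = [abs((m - e) / m) for m, e in zip(measured, estimated)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical layer latencies in milliseconds (not from the paper).
measured_ms = [1.0, 2.0, 4.0]
estimated_ms = [1.1, 1.9, 4.0]
print(f"MAPE: {mape(measured_ms, estimated_ms):.2f}%")  # prints "MAPE: 5.00%"
```

On this scale, a reported MAPE of 0.02% corresponds to estimates that deviate from the measurements by two parts in ten thousand on average.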

arXiv information

Authors	Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann
Published	2024-06-12 15:34:28+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
