Introducing Instruction-Accurate Simulators for Performance Estimation of Autotuning Workloads

Summary

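Accelerating machine learning (ML) workloads requires efficient methods because their optimization space is large.
Autotuning has emerged as an effective approach for systematically evaluating implementation variants.
Traditionally, autotuning requires executing the workloads on the target hardware (HW).
We present an interface that allows autotuning workloads to be executed on simulators.
This approach offers high scalability when the availability of the target HW is limited, since many simulations can run in parallel on any accessible HW.
In addition, we evaluate the feasibility of using fast instruction-accurate simulators for autotuning.
We train various predictors to forecast the performance of ML workload implementations on the target HW from simulation statistics.
Our results show that the tuned predictors are highly effective.
The best workload implementation in terms of actual run time on the target HW is always within the top 3% of predictions for the tested x86, ARM, and RISC-V-based architectures.
In the best case, this approach outperforms native execution on the target HW for embedded architectures when running as few as three samples on three simulators in parallel.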
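
The idea of predicting run time from simulation statistics and then checking where the truly fastest implementation lands in the predicted ranking can be illustrated with a small sketch. Everything here is an assumption made for illustration: the statistics (instruction, load, and branch counts), the linear least-squares model, and the synthetic data are not taken from the paper, which only states that "various predictors" were trained.

```python
import random

random.seed(0)
TRUE_W = [0.5, 0.3, 1.2, 0.8]  # made-up ground truth used only to generate synthetic data

def make_sample():
    # Hypothetical simulator statistics (in millions): bias, instructions, loads, branches.
    x = [1.0, random.uniform(1, 10), random.uniform(0.1, 3), random.uniform(0.1, 2)]
    # Synthetic "measured" run time: linear in the statistics plus small noise.
    t = sum(w * xi for w, xi in zip(TRUE_W, x)) + random.gauss(0, 0.05)
    return x, t

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         + [sum(row[i] * t for row, t in zip(X, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def rank_of_true_best(predicted, measured):
    """Fraction of candidates ranked at or ahead of the truly fastest one."""
    best = min(range(len(measured)), key=measured.__getitem__)
    order = sorted(range(len(predicted)), key=predicted.__getitem__)
    return (order.index(best) + 1) / len(predicted)

train = [make_sample() for _ in range(200)]
test = [make_sample() for _ in range(100)]
w = fit_linear([x for x, _ in train], [t for _, t in train])
predicted = [predict(w, x) for x, _ in test]
measured = [t for _, t in test]
frac = rank_of_true_best(predicted, measured)
print(f"true best is within the top {frac:.0%} of predictions")
```

A value of `frac` at or below 0.03 would correspond to the "top 3%" criterion quoted above; on real data the result depends on the simulator, the predictor, and the target architecture.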

Summary (original)

Accelerating Machine Learning (ML) workloads requires efficient methods due to their large optimization space. Autotuning has emerged as an effective approach for systematically evaluating variations of implementations. Traditionally, autotuning requires the workloads to be executed on the target hardware (HW). We present an interface that allows executing autotuning workloads on simulators. This approach offers high scalability when the availability of the target HW is limited, as many simulations can be run in parallel on any accessible HW. Additionally, we evaluate the feasibility of using fast instruction-accurate simulators for autotuning. We train various predictors to forecast the performance of ML workload implementations on the target HW based on simulation statistics. Our results demonstrate that the tuned predictors are highly effective. The best workload implementation in terms of actual run time on the target HW is always within the top 3 % of predictions for the tested x86, ARM, and RISC-V-based architectures. In the best case, this approach outperforms native execution on the target HW for embedded architectures when running as few as three samples on three simulators in parallel.
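
The scalability argument, that many simulations can run in parallel on any accessible HW, might look roughly like the following sketch. The function `run_on_simulator`, the candidate format, and the returned counters are hypothetical placeholders and not the interface presented in the paper; a real setup would launch a simulator process and parse its statistics instead.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_simulator(candidate):
    """Hypothetical stand-in for one simulator run.

    Derives a fake instruction count from the candidate's tile size so the
    example is runnable without any simulator installed.
    """
    tile = candidate["tile"]
    return {"tile": tile, "instructions": 1_000_000 // tile + 50 * tile}

# Candidate implementations an autotuner might propose (illustrative only).
candidates = [{"tile": t} for t in (1, 2, 4, 8, 16, 32)]

# Three workers mirror the "three simulators in parallel" setting of the abstract.
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as the input candidates.
    stats = list(pool.map(run_on_simulator, candidates))

print(f"collected statistics for {len(stats)} candidates")
```

Threads are used here because each worker would mostly wait on an external simulator process; the collected statistics would then be passed to a trained predictor rather than compared directly.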

arXiv information

Authors	Rebecca Pelke, Nils Bosbach, Lennart M. Reimann, Rainer Leupers
Published	2025-05-19 16:59:07+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
