Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

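Summary

Analog in-memory computing (AIMC), a promising approach for energy-efficient acceleration of deep learning workloads, computes matrix-vector multiplications (MVMs), but only approximately, because of nonidealities that are often non-deterministic or nonlinear.
This can adversely affect the achievable deep neural network (DNN) inference accuracy compared with a conventional floating-point (FP) implementation.
Retraining has previously been proposed to improve robustness, but prior work explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models.
Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and we investigate sensitivity and robustness to a broad set of nonidealities.
By introducing a new and highly realistic AIMC crossbar model, we improve significantly on earlier retraining approaches.
We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be retrained to show iso-accuracy on AIMC.
Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, rather than to the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.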

Summary (original)

Analog in-memory computing (AIMC) — a promising approach for energy-efficient acceleration of deep learning workloads — computes matrix-vector multiplications (MVMs) but only approximately, due to nonidealities that often are non-deterministic or nonlinear. This can adversely impact the achievable deep neural network (DNN) inference accuracy as compared to a conventional floating point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar-model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, not the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
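
To make the idea of an approximate MVM concrete, the following is a minimal sketch of a toy noisy matrix-vector product. It is not the crossbar model introduced in the paper: the additive Gaussian noise model, the function names, and the `weight_noise` and `output_noise` parameters are all assumptions chosen for illustration. It only shows how noise on the weights and noise on the outputs enter the computation at different points.

```python
import random


def ideal_mvm(weights, x):
    """Exact matrix-vector product y = W x (floating-point reference)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def noisy_mvm(weights, x, weight_noise=0.0, output_noise=0.0, rng=None):
    """Toy approximate MVM with additive Gaussian noise.

    Assumption: both noise sources are zero-mean Gaussian with the given
    standard deviations. Real AIMC nonidealities are more varied.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    y = []
    for row in weights:
        # Weight noise perturbs every stored weight before the multiply.
        acc = sum((w + rng.gauss(0.0, weight_noise)) * xi
                  for w, xi in zip(row, x))
        # Output noise is added once per output, after accumulation.
        y.append(acc + rng.gauss(0.0, output_noise))
    return y


def rms_error(a, b):
    """Root-mean-square difference between two equal-length vectors."""
    return (sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)) ** 0.5


W = [[0.5, -0.25, 0.1], [0.0, 0.3, -0.7]]
x = [1.0, 2.0, -1.0]
y_ideal = ideal_mvm(W, x)
y_noisy = noisy_mvm(W, x, weight_noise=0.02, output_noise=0.05)
print("ideal:", y_ideal)
print("noisy:", y_noisy)
print("RMS error:", rms_error(y_ideal, y_noisy))
```

Setting only one of the two noise parameters to a nonzero value gives a rough way to compare how each kind of perturbation distorts the result in this toy setting.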

arXiv information

Authors	Malte J. Rasch, Charles Mackin, Manuel Le Gallo, An Chen, Andrea Fasoli, Frederic Odermatt, Ning Li, S. R. Nandakumar, Pritish Narayanan, Hsinyu Tsai, Geoffrey W. Burr, Abu Sebastian, Vijay Narayanan
Published	2023-02-16 18:25:06+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
