Reusing Softmax Hardware Unit for GELU Computation in Transformers

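Summary

Transformers have drastically improved the performance of natural language processing (NLP) and computer vision applications.
Transformer computation involves matrix multiplications and non-linear activation functions, such as softmax and GELU (Gaussian Error Linear Unit), that are accelerated directly in hardware.
Currently, each function is evaluated separately, which rarely allows hardware reuse.
To mitigate this problem, this work maps the computation of GELU to a softmax operator.
In this way, the efficient hardware units already designed for softmax can also be reused to compute GELU.
GELU computation can exploit the inherently vectorized nature of softmax and produce multiple GELU results in parallel.
Experimental results show that computing GELU via a pre-existing, incrementally modified softmax hardware unit (a) does not reduce the accuracy of representative NLP applications and (b) reduces overall hardware area and power by 6.1% and 11.9%, respectively, on average.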

Abstract (original)

Transformers have improved drastically the performance of natural language processing (NLP) and computer vision applications. The computation of transformers involves matrix multiplications and non-linear activation functions such as softmax and GELU (Gaussian Error Linear Unit) that are accelerated directly in hardware. Currently, function evaluation is done separately for each function and rarely allows for hardware reuse. To mitigate this problem, in this work, we map the computation of GELU to a softmax operator. In this way, the efficient hardware units designed already for softmax can be reused for computing GELU as well. Computation of GELU can enjoy the inherent vectorized nature of softmax and produce in parallel multiple GELU outcomes. Experimental results show that computing GELU via a pre-existing and incrementally modified softmax hardware unit (a) does not reduce the accuracy of representative NLP applications and (b) allows the reduction of the overall hardware area and power by 6.1% and 11.9%, respectively, on average.
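
The abstract does not spell out how GELU is mapped to softmax, so the following is only a minimal sketch of one well-known identity that makes such a mapping possible, not necessarily the formulation used in the paper. It assumes the standard tanh approximation of GELU, and relies on the fact that the first output of a two-element softmax over [z, -z] equals 0.5 * (1 + tanh(z)). The function names (softmax, gelu_exact, gelu_via_softmax) are hypothetical and chosen for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gelu_exact(x):
    """Reference GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_via_softmax(x):
    """GELU (tanh approximation) computed through a two-element softmax.

    Assumption: GELU(x) ~= 0.5 * x * (1 + tanh(z)) with
    z = sqrt(2/pi) * (x + 0.044715 * x^3). Since
    softmax([z, -z])[0] = e^z / (e^z + e^-z) = 0.5 * (1 + tanh(z)),
    the gate can be read directly from a softmax output.
    """
    z = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    gate = softmax([z, -z])[0]
    return x * gate


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(x, gelu_exact(x), gelu_via_softmax(x))
```

In this sketch each GELU value needs one two-element softmax, so a softmax unit that processes a wide vector could in principle evaluate several such pairs at once, which is in the spirit of the parallelism the abstract mentions.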

arXiv information

Authors	Christodoulos Peltekis, Kosmas Alexandridis, Giorgos Dimitrakopoulos
Published	2024-02-16 08:52:29+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
