PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference

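Summary

Field-programmable gate arrays (FPGAs) are widely used to implement deep learning inference.
Standard deep neural network inference involves computing interleaved linear maps and nonlinear activation functions.
Prior work on ultra-low latency implementations hardcoded the combination of a linear map and a nonlinear activation inside FPGA lookup tables (LUTs).
Our work is motivated by the idea that the LUTs in an FPGA can implement a much greater variety of functions than this.
In this paper, we propose a novel approach to training neural networks for FPGA deployment that uses multivariate polynomials as the basic building block.
Our method exploits the flexibility offered by soft logic, hiding the polynomial evaluation inside the LUTs with minimal overhead.
We show that polynomial building blocks achieve the same accuracy with considerably fewer layers of soft logic than linear functions, leading to significant latency and area improvements.
We demonstrate the effectiveness of this approach on three tasks: network intrusion detection, jet identification at the CERN Large Hadron Collider, and handwritten digit recognition using the MNIST dataset.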

Summary (original)

Field-programmable gate arrays (FPGAs) are widely used to implement deep learning inference. Standard deep neural network inference involves the computation of interleaved linear maps and nonlinear activation functions. Prior work for ultra-low latency implementations has hardcoded the combination of linear maps and nonlinear activations inside FPGA lookup tables (LUTs). Our work is motivated by the idea that the LUTs in an FPGA can be used to implement a much greater variety of functions than this. In this paper, we propose a novel approach to training neural networks for FPGA deployment using multivariate polynomials as the basic building block. Our method takes advantage of the flexibility offered by the soft logic, hiding the polynomial evaluation inside the LUTs with minimal overhead. We show that by using polynomial building blocks, we can achieve the same accuracy using considerably fewer layers of soft logic than by using linear functions, leading to significant latency and area improvements. We demonstrate the effectiveness of this approach in three tasks: network intrusion detection, jet identification at the CERN Large Hadron Collider, and handwritten digit recognition using the MNIST dataset.
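The idea of hiding a polynomial evaluation inside a LUT can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`monomials`, `num_monomials`, `build_lut`), the mapping of input codes to the range [0, 1], and the clipping activation are all assumptions made for illustration. The sketch expands a small number of low-bit inputs into every monomial up to a chosen degree, forms a weighted sum, and stores the quantized result for every possible input combination. The polynomial itself then costs nothing at inference time, because only the table is looked up.

```python
from itertools import combinations_with_replacement, product
import math


def monomials(x, degree):
    """Return all monomials of the inputs x up to the given total degree.

    The constant term 1 (degree 0) is included first.
    """
    terms = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in combo:
                term *= x[i]
            terms.append(term)
    return terms


def num_monomials(fan_in, degree):
    """Number of monomials in fan_in variables with total degree <= degree."""
    return math.comb(fan_in + degree, degree)


def build_lut(weights, fan_in, in_bits, out_bits, degree):
    """Enumerate every input combination and store the quantized output.

    Hypothetical conventions (assumptions, not taken from the paper):
    input codes are scaled to [0, 1], and the output is clipped to [0, 1]
    before being quantized to out_bits.
    """
    levels = 2 ** in_bits
    max_out = 2 ** out_bits - 1
    table = {}
    for codes in product(range(levels), repeat=fan_in):
        x = [c / (levels - 1) for c in codes]
        y = sum(w * m for w, m in zip(weights, monomials(x, degree)))
        y = min(max(y, 0.0), 1.0)  # clipping activation (assumption)
        table[codes] = round(y * max_out)
    return table


# Two 2-bit inputs, degree 2: monomials are [1, x0, x1, x0^2, x0*x1, x1^2].
# These weights select only the cross term x0*x1.
weights = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
table = build_lut(weights, fan_in=2, in_bits=2, out_bits=2, degree=2)
print(len(table), "entries;", "f(3, 3) =", table[(3, 3)])
```

In this sketch the table size depends only on the fan-in and the input bit width (here 2^(2*2) = 16 entries), not on the polynomial degree, which is why a richer function fits into the same lookup structure as a linear one.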

arXiv information

Authors	Marta Andronic, George A. Constantinides
Published	2023-11-06 17:28:51+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
