Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing

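Summary

Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) Systems-on-Chip (SoCs) for augmented reality, personalized healthcare, and nano-robotics must run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions.
These tasks include compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control that require high-precision floating-point arithmetic.
The paper presents Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX. It combines three components:
1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores tuned for a diverse range of workloads, exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN) combined with fused MAC&LOAD operations and floating-point support;
2) a 2-to-8-bit Reconfigurable Binary Engine (RBE) that accelerates 3×3 and 1×1 (pointwise) convolutions in DNNs;
3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages.
Marsellus achieves up to 180 Gop/s or 3.32 Top/s/W on 2-bit precision arithmetic in software, and up to 637 Gop/s or 12.4 Top/s/W on hardware-accelerated DNN layers.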

Abstract (original)

Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) System-on-a-Chip (SoC) for augmented reality, personalized healthcare, and nano-robotics need to run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions: compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control requiring high-precision floating-point. We present Marsellus, an all-digital heterogeneous SoC for AI-IoT end-nodes fabricated in GlobalFoundries 22nm FDX that combines 1) a general-purpose cluster of 16 RISC-V Digital Signal Processing (DSP) cores attuned for the execution of a diverse range of workloads exploiting 4-bit and 2-bit arithmetic extensions (XpulpNN), combined with fused MAC&LOAD operations and floating-point support; 2) a 2-8bit Reconfigurable Binary Engine (RBE) to accelerate 3×3 and 1×1 (pointwise) convolutions in DNNs; 3) a set of On-Chip Monitoring (OCM) blocks connected to an Adaptive Body Biasing (ABB) generator and a hardware control loop, enabling on-the-fly adaptation of transistor threshold voltages. Marsellus achieves up to 180 Gop/s or 3.32 Top/s/W on 2-bit precision arithmetic in software, and up to 637 Gop/s or 12.4 Top/s/W on hardware-accelerated DNN layers.
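
To make the idea of 2-bit arithmetic concrete, the following is a minimal Python sketch of a dot product over operands packed two bits apiece into a single machine word. It is an illustration only and does not reproduce the XpulpNN instruction semantics. The function names (`pack_2bit`, `unpack_2bit`, `dot_2bit`), the signed two's-complement encoding, the least-significant-first lane order, and the choice of 16 lanes per 32-bit word are all assumptions made for this example.

```python
def pack_2bit(values):
    """Pack signed 2-bit integers (range -2..1) into one integer, 2 bits per lane.

    Lane 0 occupies the least significant bits (an assumption of this sketch).
    """
    word = 0
    for lane, value in enumerate(values):
        if not -2 <= value <= 1:
            raise ValueError(f"value {value} does not fit in signed 2 bits")
        word |= (value & 0b11) << (2 * lane)
    return word


def unpack_2bit(word, count):
    """Recover `count` signed 2-bit lanes from a packed word."""
    lanes = []
    for lane in range(count):
        field = (word >> (2 * lane)) & 0b11
        # Sign-extend: fields with the top bit set encode -2 or -1.
        lanes.append(field - 4 if field & 0b10 else field)
    return lanes


def dot_2bit(word_a, word_b, count=16):
    """Dot product of two packed words; hypothetical 16 lanes per 32-bit word."""
    a = unpack_2bit(word_a, count)
    b = unpack_2bit(word_b, count)
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1, -2, 0, 1] * 4
    b = [-1, -2, 1, 0] * 4
    print(dot_2bit(pack_2bit(a), pack_2bit(b)))
```

Packing 16 operands into one word is what lets a single multiply-accumulate step cover many low-precision products at once; a software loop like this one only models the arithmetic, not the speed.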

arXiv information

Authors	Francesco Conti, Gianna Paulin, Angelo Garofalo, Davide Rossi, Alfio Di Mauro, Georg Rutishauser, Gianmarco Ottavi, Manuel Eggimann, Hayate Okuhara, Luca Benini
Published	2023-11-28 15:36:11+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
