A 65 nm Bayesian Neural Network Accelerator with 360 fJ/Sample In-Word GRNG for AI Uncertainty Estimation

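Summary

Uncertainty estimation is an indispensable capability for AI-enabled, safety-critical applications such as autonomous vehicles and medical diagnosis.
Bayesian neural networks (BNNs) use Bayesian statistics to provide both classification predictions and uncertainty estimates, but they suffer from the high computational overhead of random number generation and repeated sample iterations.
Furthermore, BNNs are not immediately amenable to acceleration with compute-in-memory architectures, because each RNG operation must be followed by a memory write.
To address these challenges, this work presents an ASIC that integrates a 360 fJ/sample Gaussian RNG directly into the SRAM memory words.
This integration reduces RNG overhead and enables fully parallel compute-in-memory operation for BNNs.
The prototype chip achieves 5.12 GSa/s RNG throughput and 102 GOp/s neural network throughput while occupying 0.45 mm², bringing AI uncertainty estimation to edge computing.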

Abstract (original)

Uncertainty estimation is an indispensable capability for AI-enabled, safety-critical applications, e.g. autonomous vehicles or medical diagnosis. Bayesian neural networks (BNNs) use Bayesian statistics to provide both classification predictions and uncertainty estimation, but they suffer from high computational overhead associated with random number generation and repeated sample iterations. Furthermore, BNNs are not immediately amenable to acceleration through compute-in-memory architectures due to the frequent memory writes necessary after each RNG operation. To address these challenges, we present an ASIC that integrates 360 fJ/Sample Gaussian RNG directly into the SRAM memory words. This integration reduces RNG overhead and enables fully-parallel compute-in-memory operations for BNNs. The prototype chip achieves 5.12 GSa/s RNG throughput and 102 GOp/s neural network throughput while occupying 0.45 mm², bringing AI uncertainty estimation to edge computation.
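
To make the "random number generation and repeated sample iterations" overhead concrete, here is a minimal software sketch of Monte Carlo prediction with Gaussian-distributed weights. It is only an illustration under stated assumptions, not the chip's in-word GRNG or the authors' network: the one-layer classifier, the function name `bnn_predict`, the `mu`/`sigma` layout, and the toy values are all hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bnn_predict(x, mu, sigma, num_samples, rng):
    """Monte Carlo prediction for a toy one-layer Bayesian classifier.

    mu and sigma are [num_classes][num_inputs] lists holding the mean and
    standard deviation of each Gaussian weight (hypothetical layout).
    Returns (mean class probabilities, predictive entropy in nats,
    number of Gaussian draws consumed).
    """
    num_classes = len(mu)
    mean_probs = [0.0] * num_classes
    draws = 0
    for _ in range(num_samples):
        logits = []
        for c in range(num_classes):
            acc = 0.0
            for i, xi in enumerate(x):
                # One fresh Gaussian draw per weight per sample iteration.
                w = mu[c][i] + sigma[c][i] * rng.gauss(0.0, 1.0)
                acc += w * xi
                draws += 1
            logits.append(acc)
        probs = softmax(logits)
        mean_probs = [m + p / num_samples for m, p in zip(mean_probs, probs)]
    # Entropy of the averaged prediction serves as the uncertainty estimate.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy, draws


# Assumed toy values, chosen only for illustration.
rng = random.Random(0)
x = [1.0, -0.5, 2.0]
mu = [[0.8, 0.1, 0.5], [-0.3, 0.4, -0.2]]
sigma = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]
probs, entropy, draws = bnn_predict(x, mu, sigma, num_samples=100, rng=rng)
print(probs, entropy, draws)
```

The number of Gaussian draws grows as samples × weights (here 100 × 6), which is the cost the abstract attributes to RNG. In a conventional memory each freshly sampled weight would also have to be written back before the multiply-accumulate, which is the write traffic the abstract says hinders compute-in-memory acceleration.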

arXiv information

Authors	Zephan M. Enciso, Boyang Cheng, Likai Pei, Jianbo Liu, Steven Davis, Ningyuan Cao, Michael Niemier
Published	2025-01-08 15:47:04+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
