On-device AI: Quantization-aware Training of Transformers in Time-Series

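Summary

Artificial intelligence (AI) models for time series in pervasive computing continue to grow larger and more complex.
Among these AI models, the Transformer is the most compelling.
However, when such a large model is deployed on a resource-constrained sensor device, it is difficult to achieve the desired performance.
My research focuses on optimizing the Transformer model for time-series forecasting tasks.
The optimized model will be deployed as a hardware accelerator on embedded Field Programmable Gate Arrays (FPGAs).
I will investigate the impact of applying quantization-aware training to the Transformer model in order to reduce its size and runtime memory footprint while maximizing the advantages of FPGAs.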
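
As a rough illustration of what quantization-aware training means in general (not the authors' implementation, which the abstract does not describe), the sketch below trains a single weight whose forward pass uses a "fake-quantized" copy, while gradients update the underlying full-precision value via the straight-through estimator. The function names `fake_quantize` and `train_qat`, the 8-bit width, the fixed scale of 0.05, and the toy one-weight linear model are all assumptions made for this example.

```python
def fake_quantize(x, scale, num_bits=8):
    """Snap x to a signed integer grid of num_bits, then map it back to float."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    q = max(qmin, min(qmax, round(x / scale)))  # integer code, clamped to range
    return q * scale


def train_qat(xs, ys, scale=0.05, num_bits=8, lr=0.01, epochs=200):
    """Fit y = w * x with the weight fake-quantized in the forward pass.

    Hypothetical toy model: one scalar weight, squared-error loss, plain SGD.
    """
    w = 0.0  # latent full-precision weight, kept only during training
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w_q = fake_quantize(w, scale, num_bits)  # forward uses quantized weight
            err = w_q * x - y
            # Straight-through estimator: treat d(w_q)/d(w) as 1, so the
            # gradient of the squared error is applied to the latent weight.
            grad = 2.0 * err * x
            w -= lr * grad
    return w, fake_quantize(w, scale, num_bits)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [0.5 * x for x in xs]
    latent, quantized = train_qat(xs, ys)
    print("latent weight:", latent, "deployed (quantized) weight:", quantized)
```

Under these assumptions, only the integer code and the scale would need to be stored on the device; an 8-bit code takes a quarter of the space of a 32-bit float weight.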

Summary (original)

Artificial Intelligence (AI) models for time-series in pervasive computing keep getting larger and more complicated. The Transformer model is by far the most compelling of these AI models. However, it is difficult to obtain the desired performance when deploying such a massive model on a sensor device with limited resources. My research focuses on optimizing the Transformer model for time-series forecasting tasks. The optimized model will be deployed as hardware accelerators on embedded Field Programmable Gate Arrays (FPGAs). I will investigate the impact of applying Quantization-aware Training to the Transformer model to reduce its size and runtime memory footprint while maximizing the advantages of FPGAs.

arXiv information

Authors	Tianheng Ling, Gregor Schiele
Published	2024-08-29 12:49:22+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
