Optimizing edge AI models on HPC systems with the edge in the loop

Summary

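Artificial intelligence and machine learning models deployed on edge devices, for example for quality control in Additive Manufacturing (AM), are frequently small in size.
Such models usually have to deliver highly accurate results within a short time frame.
Methods commonly employed in the literature start from a larger trained model and reduce its memory and latency footprint through structural pruning, knowledge distillation, or quantization.
It is, however, also possible to leverage hardware-aware Neural Architecture Search (NAS), an approach that systematically explores the architecture space to find optimized configurations.
This study introduces a hardware-aware NAS workflow that couples an edge device located in Belgium with a powerful High-Performance Computing system in Germany, training candidate architectures as fast as possible while performing real-time latency measurements on the target hardware.
The approach is verified on a use case in the AM domain, based on the open RAISE-LPBF dataset, achieving ~8.8 times faster inference while simultaneously improving model quality by a factor of ~1.35 compared to a human-designed baseline.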

Abstract (original)

Artificial intelligence and machine learning models deployed on edge devices, e.g., for quality control in Additive Manufacturing (AM), are frequently small in size. Such models usually have to deliver highly accurate results within a short time frame. Methods that are commonly employed in literature start out with larger trained models and try to reduce their memory and latency footprint by structural pruning, knowledge distillation, or quantization. It is, however, also possible to leverage hardware-aware Neural Architecture Search (NAS), an approach that seeks to systematically explore the architecture space to find optimized configurations. In this study, a hardware-aware NAS workflow is introduced that couples an edge device located in Belgium with a powerful High-Performance Computing system in Germany, to train possible architecture candidates as fast as possible while performing real-time latency measurements on the target hardware. The approach is verified on a use case in the AM domain, based on the open RAISE-LPBF dataset, achieving ~8.8 times faster inference speed while simultaneously enhancing model quality by a factor of ~1.35, compared to a human-designed baseline.
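
The workflow described in the abstract can be pictured as a loop that scores each candidate on two axes: model quality (from training on the HPC system) and latency (measured on the edge device). The sketch below is only an illustration under stated assumptions, not the authors' implementation. The `Candidate` fields, the `train_and_score` and `measure_latency_ms` stand-ins, the random sampling strategy, and the Pareto-front selection are all hypothetical; in a real setup the two stand-ins would be replaced by an actual training job and a timed forward pass on the target hardware.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    depth: int  # number of layers (hypothetical search dimension)
    width: int  # channels per layer (hypothetical search dimension)


def train_and_score(c: Candidate) -> float:
    """Stand-in for training on the HPC system; higher is better."""
    # Toy proxy (assumption): quality saturates as the model grows.
    return 1.0 - 1.0 / (1.0 + 0.3 * c.depth + 0.01 * c.width)


def measure_latency_ms(c: Candidate) -> float:
    """Stand-in for a timed forward pass on the edge device."""
    # Toy proxy (assumption): cost grows with depth and depth * width.
    return 0.5 * c.depth + 0.05 * c.depth * c.width


def pareto_front(results):
    """Keep entries not dominated in (higher quality, lower latency)."""
    front = []
    for cand, q, lat in results:
        dominated = any(
            (q2 >= q and l2 <= lat) and (q2 > q or l2 < lat)
            for _, q2, l2 in results
        )
        if not dominated:
            front.append((cand, q, lat))
    return front


def search(n_trials: int = 20, seed: int = 0):
    """Randomly sample candidates, evaluate both objectives, return the front."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        cand = Candidate(
            depth=rng.randint(2, 12),
            width=rng.choice([8, 16, 32, 64]),
        )
        results.append((cand, train_and_score(cand), measure_latency_ms(cand)))
    return pareto_front(results)


if __name__ == "__main__":
    for cand, quality, latency in sorted(search(), key=lambda r: r[2]):
        print(f"{cand}  quality={quality:.3f}  latency={latency:.1f} ms")
```

Returning a Pareto front rather than a single winner leaves the final trade-off between quality and latency to the user, which is one common way to handle two competing objectives; the abstract does not state which selection rule the study uses.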

arXiv information

Authors	Marcel Aach, Cyril Blanc, Andreas Lintermann, Kurt De Grave
Published	2025-05-26 13:47:36+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
