Online Algorithms for Hierarchical Inference in Deep Learning applications at the Edge

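Summary

We consider a resource-constrained edge device (ED), such as an IoT sensor or a microcontroller unit, embedded with a small ML model (S-ML) for a generic classification application, together with an edge server (ES) that hosts a large ML model (L-ML).
Because S-ML is less accurate than L-ML, offloading every data sample to the ES yields high inference accuracy, but it defeats the purpose of embedding S-ML on the ED and forfeits the benefits of local inference: lower latency, bandwidth savings, and energy efficiency.
To get the best of both worlds, that is, the benefits of inference on the ED and of inference on the ES, we explore Hierarchical Inference (HI), in which an S-ML inference is accepted only when it is correct; otherwise the data sample is offloaded for L-ML inference.
However, an ideal implementation of HI is infeasible because the ED does not know whether an S-ML inference is correct.
We propose an online meta-learning framework that the ED can use to predict the correctness of the S-ML inference.
In particular, we propose using the maximum softmax value that S-ML outputs for a data sample to decide whether to offload it.
The resulting online learning problem turns out to be a Prediction with Expert Advice (PEA) problem with a continuous expert space.
We propose two algorithms and prove sublinear regret bounds for them without assuming any smoothness of the loss function.
We evaluate and benchmark the proposed algorithms on image classification using four datasets: Imagenette, Imagewoof, MNIST, and CIFAR-10.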
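
The offloading rule described in the summary (compare the maximum softmax value of S-ML against a threshold) can be illustrated with a minimal Python sketch. The function names, the `threshold` parameter, and the example logits are hypothetical and chosen only for illustration; the abstract does not specify how the threshold is set, and this is not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hi_decision(logits, threshold):
    """Return (predicted_class, offload) for one data sample.

    Assumption: the sample is offloaded to the ES when the maximum
    softmax value of S-ML falls below `threshold` (hypothetical name).
    """
    probs = softmax(logits)
    confidence = max(probs)
    predicted = probs.index(confidence)
    return predicted, confidence < threshold

# Peaked logits give a high maximum softmax value: keep the local inference.
confident = hi_decision([4.0, 0.5, 0.1], threshold=0.7)
# Nearly flat logits give a low maximum softmax value: offload to L-ML.
uncertain = hi_decision([1.0, 0.9, 0.8], threshold=0.7)
```

With a fixed threshold the rule is a one-line comparison; the hard part the paper addresses is choosing that threshold online.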

Abstract (original)

We consider a resource-constrained Edge Device (ED), such as an IoT sensor or a microcontroller unit, embedded with a small-size ML model (S-ML) for a generic classification application and an Edge Server (ES) that hosts a large-size ML model (L-ML). Since the inference accuracy of S-ML is lower than that of the L-ML, offloading all the data samples to the ES results in high inference accuracy, but it defeats the purpose of embedding S-ML on the ED and deprives the benefits of reduced latency, bandwidth savings, and energy efficiency of doing local inference. In order to get the best out of both worlds, i.e., the benefits of doing inference on the ED and the benefits of doing inference on ES, we explore the idea of Hierarchical Inference (HI), wherein S-ML inference is only accepted when it is correct, otherwise the data sample is offloaded for L-ML inference. However, the ideal implementation of HI is infeasible as the correctness of the S-ML inference is not known to the ED. We propose an online meta-learning framework that the ED can use to predict the correctness of the S-ML inference. In particular, we propose to use the maximum softmax value output by S-ML for a data sample and decide whether to offload it or not. The resulting online learning problem turns out to be a Prediction with Expert Advice (PEA) problem with continuous expert space. We propose two different algorithms and prove sublinear regret bounds for them without any assumption on the smoothness of the loss function. We evaluate and benchmark the performance of the proposed algorithms for image classification application using four datasets, namely, Imagenette and Imagewoof, MNIST, and CIFAR-10.
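
To make the "Prediction with Expert Advice with continuous expert space" framing concrete, the sketch below treats each candidate threshold as an expert and runs the standard exponential-weights (Hedge) update over a finite grid of thresholds. It is not either of the two algorithms proposed in the paper. Everything here is an assumption for illustration: the discretization of the threshold space, the cost model (offloading costs `beta`, accepting a wrong S-ML inference costs 1, accepting a correct one costs 0), the learning rate `eta`, the synthetic stream, and the full-information feedback, meaning the correctness of S-ML is revealed every round, which may not hold when a sample is not offloaded.

```python
import math
import random

def threshold_loss(theta, confidence, correct, beta):
    """Loss of threshold `theta` on one sample (assumed cost model)."""
    if confidence < theta:
        return beta                   # offloaded: pay the offloading cost
    return 0.0 if correct else 1.0    # accepted: pay 1 only if S-ML was wrong

def hedge_thresholds(samples, grid, beta=0.3, eta=0.5, seed=0):
    """Exponential weights over a finite grid of thresholds.

    samples: iterable of (confidence, correct) pairs.
    Returns (final_weights, cumulative_loss_of_learner).
    """
    rng = random.Random(seed)
    weights = [1.0 / len(grid)] * len(grid)
    total_loss = 0.0
    for confidence, correct in samples:
        # Draw a threshold in proportion to its current weight.
        theta = rng.choices(grid, weights=weights, k=1)[0]
        total_loss += threshold_loss(theta, confidence, correct, beta)
        # Full-information update (assumption): score every threshold.
        weights = [
            w * math.exp(-eta * threshold_loss(t, confidence, correct, beta))
            for w, t in zip(weights, grid)
        ]
        # Renormalize so the weights stay a probability distribution.
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return weights, total_loss

# Synthetic stream: confident samples are correct, unsure ones are wrong.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
stream = [(0.9, True), (0.4, False)] * 100
weights, total_loss = hedge_thresholds(stream, grid)
```

On this synthetic stream the thresholds 0.5 and 0.75 separate the two kinds of samples equally well, so nearly all of the final weight ends up on those two grid points. A finite grid only approximates the continuous expert space that the abstract refers to.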

arXiv information

Authors	Vishnu Narayanan Moothedath, Jaya Prakash Champati, James Gross
Published	2024-02-15 15:56:37+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
