Attention-based Feature Compression for CNN Inference Offloading in Edge Computing

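Abstract

This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
Inspired by the emerging paradigm of semantic communication, we propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
To compress the intermediate data by selecting the most important features, we design a feature compression module based on the channel attention method in CNNs.
To further reduce communication overhead, entropy encoding can be applied to remove the statistical redundancy in the compressed data.
At the receiver, we design a lightweight decoder that learns to reconstruct the intermediate data from the received compressed data, which improves accuracy.
To speed up convergence, we train the resulting neural networks, which are based on the ResNet-50 architecture, with a step-by-step approach.
Experimental results show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss, outperforming the state-of-the-art work, BottleNet++.
Compared with offloading the inference task directly to the edge server, AECNN completes the inference task earlier, particularly under poor wireless channel conditions. This highlights the effectiveness of AECNN in guaranteeing higher accuracy within a time constraint.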
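
The idea of selecting the most important channels can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified squeeze-and-excitation-style scoring in which each channel is average-pooled and passed through its own sigmoid gate, whereas a real channel attention module typically mixes channels through trained fully connected layers. The function names, the `weights` and `biases` values, and the toy feature map are all hypothetical.

```python
import math


def channel_scores(feature_map, weights, biases):
    """Score each channel: global average pooling followed by a sigmoid gate.

    `weights` and `biases` stand in for learned parameters (hypothetical here).
    """
    scores = []
    for channel, w, b in zip(feature_map, weights, biases):
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)  # "squeeze" the channel to one number
        scores.append(1.0 / (1.0 + math.exp(-(w * pooled + b))))
    return scores


def compress(feature_map, scores, keep):
    """Keep the `keep` highest-scoring channels, preserving channel order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [feature_map[i] for i in kept]


# Toy feature map: 4 channels of 2x2 activations (illustrative values only).
feature_map = [
    [[0.1, 0.2], [0.0, 0.1]],
    [[2.0, 1.5], [1.8, 2.2]],
    [[0.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.9], [1.1, 1.2]],
]
weights = [1.0, 1.0, 1.0, 1.0]  # untrained placeholder parameters
biases = [0.0, 0.0, 0.0, 0.0]

scores = channel_scores(feature_map, weights, biases)
kept_indices, compressed = compress(feature_map, scores, keep=2)
print(kept_indices)  # [1, 3]
```

Keeping 2 of 4 channels halves the payload in this toy case. The receiver would also need the kept indices (or a fixed selection) so that a decoder can place the channels correctly before reconstructing the rest.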

Abstract (original)

This paper studies the computational offloading of CNN inference in device-edge co-inference systems. Inspired by the emerging paradigm semantic communication, we propose a novel autoencoder-based CNN architecture (AECNN), for effective feature extraction at end-device. We design a feature compression module based on the channel attention method in CNN, to compress the intermediate data by selecting the most important features. To further reduce communication overhead, we can use entropy encoding to remove the statistical redundancy in the compressed data. At the receiver, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy. To fasten the convergence, we use a step-by-step approach to train the neural networks obtained based on ResNet-50 architecture. Experimental results show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss, which outperforms the state-of-the-art work, BottleNet++. Compared to offloading inference task directly to edge server, AECNN can complete inference task earlier, in particular, under poor wireless channel condition, which highlights the effectiveness of AECNN in guaranteeing higher accuracy within time constraint.
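
The latency claim can be made concrete with back-of-the-envelope arithmetic. The sketch below is only an illustration under stated assumptions: every number in it (payload sizes, compute times, data rates) is hypothetical and not taken from the paper. It compares sending the raw input to the server with running the first layers locally and sending features compressed by 256x.

```python
def transfer_time_s(payload_bytes, rate_bps):
    """Time to send a payload over a link with the given rate in bits per second."""
    return payload_bytes * 8 / rate_bps


def total_latency_s(local_compute_s, payload_bytes, rate_bps, server_compute_s):
    """End-to-end latency: local compute, then transmission, then server compute."""
    return local_compute_s + transfer_time_s(payload_bytes, rate_bps) + server_compute_s


# Hypothetical payloads, assuming one byte per value.
INPUT_BYTES = 224 * 224 * 3            # raw RGB image sent when offloading directly
FEATURE_BYTES = 256 * 56 * 56          # assumed intermediate feature map size
COMPRESSED_BYTES = FEATURE_BYTES // 256  # after an assumed 256x compression

# Hypothetical compute times in seconds.
LOCAL_S = 0.05          # end device runs the first layers and the encoder
SERVER_FULL_S = 0.010   # server runs the whole network
SERVER_REST_S = 0.008   # server runs the decoder and the remaining layers

POOR_BPS = 1e6    # assumed poor channel: 1 Mbit/s
GOOD_BPS = 100e6  # assumed good channel: 100 Mbit/s

direct_poor = total_latency_s(0.0, INPUT_BYTES, POOR_BPS, SERVER_FULL_S)
split_poor = total_latency_s(LOCAL_S, COMPRESSED_BYTES, POOR_BPS, SERVER_REST_S)
direct_good = total_latency_s(0.0, INPUT_BYTES, GOOD_BPS, SERVER_FULL_S)
split_good = total_latency_s(LOCAL_S, COMPRESSED_BYTES, GOOD_BPS, SERVER_REST_S)

print(f"poor channel: direct={direct_poor:.3f}s split={split_poor:.3f}s")
print(f"good channel: direct={direct_good:.3f}s split={split_good:.3f}s")
```

With these assumed numbers, the compressed split finishes first on the poor channel because transmission dominates, while direct offloading finishes first on the good channel because local compute dominates. This matches the abstract's emphasis on poor wireless conditions.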

arXiv information

Authors	Nan Li, Alexandros Iosifidis, Qi Zhang
Published	2023-02-10 16:16:00+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
