Adaptive Deep Neural Network Inference Optimization with EENet

Abstract

Well-trained deep neural networks (DNNs) treat all test samples equally during prediction. Adaptive DNN inference with early exiting leverages the observation that some test examples are easier to predict than others. This paper presents EENet, a novel early-exit scheduling framework for multi-exit DNN models. Instead of passing every sample through all DNN layers during prediction, EENet learns an early-exit scheduler that terminates inference early for predictions in which the model is already highly confident at an early exit. In contrast to previous early-exit solutions based on heuristics, the EENet framework optimizes an early-exit policy to maximize model accuracy while satisfying a given average per-sample inference budget. Extensive experiments are conducted on four computer vision datasets (CIFAR-10, CIFAR-100, ImageNet, Cityscapes) and two NLP datasets (SST-2, AgNews). The results demonstrate that adaptive inference with EENet can outperform representative existing early-exit techniques. The authors also perform a detailed visualization analysis of the comparison results to interpret the benefits of EENet.
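
To make the budget idea concrete, here is a minimal sketch of confidence-threshold early exiting for a multi-exit model. It illustrates the general heuristic style that the abstract contrasts EENet against, not EENet's learned scheduler, whose details are not given here. The function names, thresholds, confidence values, and cost units are all hypothetical.

```python
from typing import List, Sequence, Tuple


def choose_exit(confidences: Sequence[float], thresholds: Sequence[float]) -> int:
    """Return the index of the first exit whose confidence meets its threshold.

    Assumes one threshold per exit; the final exit always accepts, so every
    sample receives a prediction.
    """
    last = len(confidences) - 1
    for k, (conf, thr) in enumerate(zip(confidences, thresholds)):
        if k == last or conf >= thr:
            return k
    return last


def schedule(
    samples: Sequence[Sequence[float]],
    thresholds: Sequence[float],
    exit_costs: Sequence[float],
) -> Tuple[List[int], float]:
    """Assign each sample to an exit and report the average per-sample cost.

    exit_costs[k] is the (hypothetical) cumulative cost of running the model
    up to and including exit k.
    """
    exits = [choose_exit(c, thresholds) for c in samples]
    mean_cost = sum(exit_costs[k] for k in exits) / len(exits)
    return exits, mean_cost


# Hypothetical per-exit confidences for three samples on a three-exit model.
samples = [
    [0.95, 0.97, 0.99],  # easy sample: confident at the first exit
    [0.40, 0.85, 0.90],  # medium sample: confident at the second exit
    [0.30, 0.50, 0.70],  # hard sample: falls through to the final exit
]
thresholds = [0.9, 0.8, 0.0]   # last entry is ignored; final exit always accepts
exit_costs = [1.0, 2.0, 3.0]   # arbitrary cost units

exits, mean_cost = schedule(samples, thresholds, exit_costs)
```

Raising the thresholds pushes more samples to later exits and increases the average cost, while lowering them does the opposite. A budget-aware scheduler has to choose the exit rule so that this average stays within the given per-sample budget while accuracy is kept as high as possible.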

arXiv information

Authors	Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, Ling Liu
Published	2023-12-01 17:12:35+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
