Computer Vision Model Compression Techniques for Embedded Systems: A Survey

Abstract

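Deep neural networks have consistently delivered state-of-the-art results on most computer vision problems.
In these settings, large and complex models have outperformed smaller architectures, especially when trained on large amounts of representative data.
With the recent adoption of Vision Transformer (ViT) based architectures and advanced convolutional neural networks (CNNs), the total parameter count of leading backbone architectures has grown from 62 million with AlexNet in 2012 to 7 billion with AIM-7B in 2024.
As a result, deploying such deep architectures is challenging in environments with processing and runtime constraints, particularly embedded systems.
This paper covers the main model compression techniques applied to computer vision tasks, which enable modern models to run on embedded systems.
We present the characteristics of each compression subarea, compare different approaches, and discuss how to choose the best technique and what variation to expect when analyzing it on various embedded devices.
We also share code to help researchers and new practitioners overcome the initial implementation challenges in each subarea, and we present trends in model compression.
Case studies for compression models are available at \href{https://github.com/venturusbr/cv-model-compression}{https://github.com/venturusbr/cv-model-compression}.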

Abstract (original)

Deep neural networks have consistently represented the state of the art in most computer vision problems. In these scenarios, larger and more complex models have demonstrated superior performance to smaller architectures, especially when trained with plenty of representative data. With the recent adoption of Vision Transformer (ViT) based architectures and advanced Convolutional Neural Networks (CNNs), the total number of parameters of leading backbone architectures increased from 62M parameters in 2012 with AlexNet to 7B parameters in 2024 with AIM-7B. Consequently, deploying such deep architectures faces challenges in environments with processing and runtime constraints, particularly in embedded systems. This paper covers the main model compression techniques applied for computer vision tasks, enabling modern models to be used in embedded systems. We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique and expected variations when analyzing it on various embedded devices. We also share codes to assist researchers and new practitioners in overcoming initial implementation challenges for each subarea and present trends for Model Compression. Case studies for compression models are available at \href{https://github.com/venturusbr/cv-model-compression}{https://github.com/venturusbr/cv-model-compression}.

arXiv information

Authors	Alexandre Lopes, Fernando Pereira dos Santos, Diulhio de Oliveira, Mauricio Schiezaro, Helio Pedrini
Published	2024-08-15 16:41:55+00:00
arXiv site	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
