Efficiency 360: Efficient Vision Transformers

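Summary

Transformers are widely used to solve tasks in natural language processing, computer vision, speech, and music.
This paper discusses the efficiency of transformers in terms of memory (the number of parameters), computational cost (the number of floating-point operations), and model performance, which covers accuracy, robustness, and fair, bias-free features.
The main focus is the vision transformer for the image classification task.
The contribution is an efficient 360 framework that covers various aspects of the vision transformer in order to make it more efficient for industrial applications.
With those applications in mind, the aspects are categorized along multiple dimensions: privacy, robustness, transparency, fairness, inclusiveness, continual learning, probabilistic models, approximation, computational complexity, and spectral complexity.
Various vision transformer models are compared on multiple datasets by their performance, number of parameters, and number of floating-point operations (FLOPs).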
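
The two cost measures named in the summary, parameter count and floating-point operations, can be estimated directly from a model's shape. The following is a minimal sketch, not the authors' method: the `ViTConfig` defaults assume a standard ViT-Base/16-style encoder (224x224 input, 16x16 patches, width 768, depth 12, MLP width 3072, 1000 classes), none of which are stated in the abstract, and the names `ViTConfig`, `count_parameters`, and `count_macs` are hypothetical. The compute estimate counts multiply-accumulate operations (MACs) in the linear layers and attention products only, and ignores softmax, LayerNorm, activation functions, and bias additions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    # Assumed ViT-Base/16-style defaults; these values are not from the abstract.
    image_size: int = 224
    patch_size: int = 16
    channels: int = 3
    dim: int = 768
    depth: int = 12
    mlp_dim: int = 3072
    num_classes: int = 1000

    @property
    def num_patches(self) -> int:
        return (self.image_size // self.patch_size) ** 2

    @property
    def num_tokens(self) -> int:
        return self.num_patches + 1  # patches plus one class token


def count_parameters(cfg: ViTConfig) -> int:
    """Count learnable parameters of a plain ViT encoder with a linear head."""
    d = cfg.dim
    patch_in = cfg.channels * cfg.patch_size ** 2
    embed = patch_in * d + d              # patch projection weight and bias
    embed += d                            # class token
    embed += cfg.num_tokens * d           # learned position embeddings
    attn = (d * 3 * d + 3 * d) + (d * d + d)                       # QKV and output projection
    mlp = (d * cfg.mlp_dim + cfg.mlp_dim) + (cfg.mlp_dim * d + d)  # two linear layers
    norms = 2 * (2 * d)                   # two LayerNorms per block (scale and shift)
    head = 2 * d + (d * cfg.num_classes + cfg.num_classes)  # final LayerNorm and classifier
    return embed + cfg.depth * (attn + mlp + norms) + head


def count_macs(cfg: ViTConfig) -> int:
    """Approximate multiply-accumulates for one forward pass on one image."""
    d, n = cfg.dim, cfg.num_tokens
    patch_in = cfg.channels * cfg.patch_size ** 2
    embed = cfg.num_patches * patch_in * d
    # QKV projection, attention scores, attention-weighted values, output projection.
    attn = n * d * 3 * d + 2 * n * n * d + n * d * d
    mlp = 2 * n * d * cfg.mlp_dim
    head = d * cfg.num_classes            # classifier applied to the class token only
    return embed + cfg.depth * (attn + mlp) + head


cfg = ViTConfig()
print(f"parameters: {count_parameters(cfg) / 1e6:.1f} M")
print(f"MACs:       {count_macs(cfg) / 1e9:.2f} G")
```

Under these assumed defaults the sketch gives roughly 86.6 million parameters and about 17.6 billion MACs. The `2 * n * n * d` term grows quadratically with the number of tokens, which is why attention cost is a common target when making vision transformers more efficient.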

Summary (original)

Transformers are widely used for solving tasks in natural language processing, computer vision, speech, and music domains. In this paper, we talk about the efficiency of transformers in terms of memory (the number of parameters), computation cost (number of floating-point operations), and performance of models, including accuracy, the robustness of the model, and fair & bias-free features. We mainly discuss the vision transformer for the image classification task. Our contribution is to introduce an efficient 360 framework, which includes various aspects of the vision transformer, to make it more efficient for industrial applications. By considering those applications, we categorize them into multiple dimensions such as privacy, robustness, transparency, fairness, inclusiveness, continual learning, probabilistic models, approximation, computational complexity, and spectral complexity. We compare various vision transformer models based on their performance, the number of parameters, and the number of floating-point operations (FLOPs) on multiple datasets.

arXiv information

Authors	Badri N. Patro, Vijay Srinivas Agneeswaran
Published	2023-02-23 19:36:20+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
