SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

Summary

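Since the introduction of Vision Transformers, many computer vision tasks that had been overwhelmingly dominated by CNNs, such as semantic segmentation, have recently been transformed.
However, the computational cost and memory requirements of these methods make them unsuitable for mobile devices, especially for high-resolution, per-pixel semantic segmentation.
In this paper, we introduce the squeeze-enhanced Axial Transformer (SeaFormer), a new method for mobile semantic segmentation.
Specifically, we design a generic attention block characterized by a squeeze axial formulation and detail enhancement.
This block can in turn be used to build a family of backbone architectures with superior cost-effectiveness.
Coupled with a light segmentation head, we achieve the best trade-off between segmentation accuracy and latency on ARM-based mobile devices on the ADE20K and Cityscapes datasets.
Critically, we beat both mobile-friendly rivals and Transformer-based counterparts, with better performance and lower latency, without bells and whistles.
Beyond semantic segmentation, we also apply the proposed SeaFormer architecture to image classification, demonstrating its potential as a versatile mobile-friendly backbone.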
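
The abstract names a "squeeze axial" attention formulation without spelling it out. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "squeeze" means averaging the feature map along one spatial axis, that attention then runs separately over the resulting row and column sequences, and that the two results are broadcast back and summed. The function names are hypothetical, the learned query/key/value projections are left out (q = k = v), only a single head is used, and the detail enhancement branch is omitted entirely.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product self-attention over a list of vectors.

    Assumption for brevity: no learned projections, so q = k = v = seq.
    """
    dim = len(seq[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in seq:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(dim)])
    return out


def squeeze_axial_attention(fmap):
    """Hypothetical sketch: fmap is an H x W x C nested list; returns H x W x C."""
    H, W, C = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Squeeze (assumed to be mean pooling): average over the width to get
    # H row descriptors, and over the height to get W column descriptors.
    rows = [[sum(fmap[i][j][c] for j in range(W)) / W for c in range(C)] for i in range(H)]
    cols = [[sum(fmap[i][j][c] for i in range(H)) / H for c in range(C)] for j in range(W)]
    # Attend along each axis separately: sequences of length H and W
    # instead of one sequence of length H * W.
    row_out = self_attention(rows)
    col_out = self_attention(cols)
    # Broadcast both results back to the full H x W grid and sum them.
    return [[[row_out[i][c] + col_out[j][c] for c in range(C)] for j in range(W)] for i in range(H)]


if __name__ == "__main__":
    feature_map = [[[float(i + j + c) for c in range(4)] for j in range(3)] for i in range(2)]
    result = squeeze_axial_attention(feature_map)
    print(len(result), len(result[0]), len(result[0][0]))
```

Under these assumptions the attention score matrices have sizes H x H and W x W rather than (H·W) x (H·W), which is the kind of saving that matters for high-resolution inputs on mobile hardware. The squeeze step discards local detail, which is presumably what a detail enhancement branch would compensate for.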

Summary (original)

Since the introduction of Vision Transformers, the landscape of many computer vision tasks (e.g., semantic segmentation), which has been overwhelmingly dominated by CNNs, recently has significantly revolutionized. However, the computational cost and memory requirement render these methods unsuitable on the mobile device, especially for the high-resolution per-pixel semantic segmentation task. In this paper, we introduce a new method squeeze-enhanced Axial TransFormer (SeaFormer) for mobile semantic segmentation. Specifically, we design a generic attention block characterized by the formulation of squeeze Axial and detail enhancement. It can be further used to create a family of backbone architectures with superior cost-effectiveness. Coupled with a light segmentation head, we achieve the best trade-off between segmentation accuracy and latency on the ARM-based mobile devices on the ADE20K and Cityscapes datasets. Critically, we beat both the mobile-friendly rivals and Transformer-based counterparts with better performance and lower latency without bells and whistles. Beyond semantic segmentation, we further apply the proposed SeaFormer architecture to image classification problem, demonstrating the potentials of serving as a versatile mobile-friendly backbone.

arXiv information

Authors	Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang
Published	2023-01-30 18:34:16+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
