HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation

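Abstract

Convolutional neural networks (CNNs) have become the consensus choice for medical image segmentation tasks. However, because of the nature of the convolution operation, they are limited in modeling long-range dependencies and spatial correlations. Transformers were developed to address this problem, but they fail to capture low-level features. Meanwhile, both local and global features have been shown to be important for dense prediction, such as segmentation in challenging contexts. In this paper, we propose HiFormer, a novel method that efficiently bridges a CNN and a transformer for medical image segmentation. Specifically, we design two multi-scale feature representations using a Swin Transformer module and a CNN-based encoder. To properly fuse the global and local features obtained from these two representations, we propose a Double-Level Fusion (DLF) module in the skip connections of the encoder-decoder structure. Extensive experiments on various medical image segmentation datasets demonstrate the effectiveness of HiFormer over other CNN-based, transformer-based, and hybrid methods in terms of computational complexity and quantitative and qualitative results. Our code is publicly available at: https://github.com/amirhossein-kz/HiFormer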

Abstract (original)

Convolutional neural networks (CNNs) have been the consensus for medical image segmentation tasks. However, they suffer from the limitation in modeling long-range dependencies and spatial correlations due to the nature of convolution operation. Although transformers were first developed to address this issue, they fail to capture low-level features. In contrast, it is demonstrated that both local and global features are crucial for dense prediction, such as segmenting in challenging contexts. In this paper, we propose HiFormer, a novel method that efficiently bridges a CNN and a transformer for medical image segmentation. Specifically, we design two multi-scale feature representations using the seminal Swin Transformer module and a CNN-based encoder. To secure a fine fusion of global and local features obtained from the two aforementioned representations, we propose a Double-Level Fusion (DLF) module in the skip connection of the encoder-decoder structure. Extensive experiments on various medical image segmentation datasets demonstrate the effectiveness of HiFormer over other CNN-based, transformer-based, and hybrid methods in terms of computational complexity, and quantitative and qualitative results. Our code is publicly available at: https://github.com/amirhossein-kz/HiFormer

arXiv information

Authors	Moein Heidari, Amirhossein Kazerouni, Milad Soltany, Reza Azad, Ehsan Khodapanah Aghdam, Julien Cohen-Adad, Dorit Merhof
Published	2023-01-09 15:06:30+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
