MS-Twins: Multi-Scale Deep Self-Attention Networks for Medical Image Segmentation

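Abstract

Although transformers are the preferred architecture in natural language processing, they have only been applied to medical imaging in recent years.
Because they capture long-range dependencies, transformers are expected to help convolutional neural networks overcome their inherent spatial inductive bias.
Recently proposed transformer-based segmentation methods use the transformer only as an auxiliary module that helps encode global context into convolutional representations.
How best to integrate self-attention with convolution has not been investigated in depth.
To address this, the paper proposes MS-Twins (Multi-Scale Twins), a powerful segmentation model built on the combination of self-attention and convolution.
By combining different scales and cascading features, MS-Twins captures both semantic and fine-grained information more effectively.
Compared with existing network architectures, MS-Twins improves on previous transformer-based methods on two commonly used datasets, Synapse and ACDC.
In particular, MS-Twins outperforms SwinUNet by 8% on Synapse.
Even compared with nnUNet, the best fully convolutional medical image segmentation network, MS-Twins retains a slight advantage on Synapse and ACDC.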

Abstract (original)

Although transformer is preferred in natural language processing, some studies has only been applied to the field of medical imaging in recent years. For its long-term dependency, the transformer is expected to contribute to unconventional convolution neural net conquer their inherent spatial induction bias. The lately suggested transformer-based segmentation method only uses the transformer as an auxiliary module to help encode the global context into a convolutional representation. How to optimally integrate self-attention with convolution has not been investigated in depth. To solve the problem, this paper proposes MS-Twins (Multi-Scale Twins), which is a powerful segmentation model on account of the bond of self-attention and convolution. MS-Twins can better capture semantic and fine-grained information by combining different scales and cascading features. Compared with the existing network structure, MS-Twins has made progress on the previous method based on the transformer of two in common use data sets, Synapse and ACDC. In particular, the performance of MS-Twins on Synapse is 8% higher than SwinUNet. Even compared with nnUNet, the best entirely convoluted medical image segmentation network, the performance of MS-Twins on Synapse and ACDC still has a bit advantage.

arXiv information

Author	Jing Xu
Published	2024-09-16 17:40:31+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
