QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images

Summary

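Deploying advanced deep learning models for medical image segmentation is often constrained by the need for extensively annotated datasets.
Weakly-supervised learning, which tolerates less precise labels, has become a promising solution to this challenge.
Building on this approach, we propose QMaxViT-Unet+, a novel framework for scribble-supervised medical image segmentation.
The framework is built on the U-Net architecture, with the encoder and decoder replaced by Multi-Axis Vision Transformer (MaxViT) blocks.
These blocks improve the model's ability to learn local and global features efficiently.
In addition, our approach integrates a query-based Transformer decoder to refine features, and an edge enhancement module to compensate for the limited boundary information in scribble labels.
We evaluate the proposed QMaxViT-Unet+ on four public datasets covering cardiac structures, colorectal polyps, and breast cancer: ACDC, MS-CMRSeg, SUN-SEG, and BUSI.
The evaluation metrics are the Dice similarity coefficient (DSC) and the 95th percentile of the Hausdorff distance (HD95).
Experimental results show that QMaxViT-Unet+ achieves 89.1% DSC and 1.316 mm HD95 on ACDC, 88.4% DSC and 2.226 mm HD95 on MS-CMRSeg, 71.4% DSC and 4.996 mm HD95 on SUN-SEG, and 69.4% DSC and 50.122 mm HD95 on BUSI.
These results show that our method outperforms existing approaches in accuracy, robustness, and efficiency while remaining competitive with fully-supervised learning approaches.
This makes it well suited to medical image analysis, where high-quality annotations are often scarce and require significant effort and expense.
The code is available at https://github.com/anpc849/QMaxViT-Unet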

Abstract (original)

The deployment of advanced deep learning models for medical image segmentation is often constrained by the requirement for extensively annotated datasets. Weakly-supervised learning, which allows less precise labels, has become a promising solution to this challenge. Building on this approach, we propose QMaxViT-Unet+, a novel framework for scribble-supervised medical image segmentation. This framework is built on the U-Net architecture, with the encoder and decoder replaced by Multi-Axis Vision Transformer (MaxViT) blocks. These blocks enhance the model’s ability to learn local and global features efficiently. Additionally, our approach integrates a query-based Transformer decoder to refine features and an edge enhancement module to compensate for the limited boundary information in the scribble label. We evaluate the proposed QMaxViT-Unet+ on four public datasets focused on cardiac structures, colorectal polyps, and breast cancer: ACDC, MS-CMRSeg, SUN-SEG, and BUSI. Evaluation metrics include the Dice similarity coefficient (DSC) and the 95th percentile of Hausdorff distance (HD95). Experimental results show that QMaxViT-Unet+ achieves 89.1% DSC and 1.316 mm HD95 on ACDC, 88.4% DSC and 2.226 mm HD95 on MS-CMRSeg, 71.4% DSC and 4.996 mm HD95 on SUN-SEG, and 69.4% DSC and 50.122 mm HD95 on BUSI. These results demonstrate that our method outperforms existing approaches in terms of accuracy, robustness, and efficiency while remaining competitive with fully-supervised learning approaches. This makes it ideal for medical image analysis, where high-quality annotations are often scarce and require significant effort and expense. The code is available at: https://github.com/anpc849/QMaxViT-Unet
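
The two evaluation metrics named in the abstract, DSC and HD95, can be illustrated with a small standard-library sketch for 2D binary masks. This is not the authors' evaluation code. The function names (`dice`, `boundary`, `hd95`) and the `spacing` parameter are hypothetical, and the HD95 convention used here (95th percentile of the pooled boundary-to-boundary distances in both directions, with linear interpolation) is one common choice assumed for illustration; other tools define it slightly differently.

```python
import math

def dice(pred, target):
    """Dice similarity coefficient for two binary masks (lists of rows of 0/1)."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total

def boundary(mask):
    """Foreground pixels with a background or out-of-image 4-neighbour."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    pts.append((y, x))
                    break
    return pts

def percentile(values, q):
    """q-th percentile with linear interpolation between sorted neighbours."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def hd95(pred, target, spacing=(1.0, 1.0)):
    """95th-percentile Hausdorff distance; spacing is (row, column) pixel size in mm."""
    a, b = boundary(pred), boundary(target)
    if not a or not b:
        raise ValueError("HD95 is undefined when either mask is empty")

    def directed(src, dst):
        # Distance from each source boundary point to its nearest destination point.
        return [
            min(math.hypot((y1 - y2) * spacing[0], (x1 - x2) * spacing[1])
                for y2, x2 in dst)
            for y1, x1 in src
        ]

    return percentile(directed(a, b) + directed(b, a), 95)

# A 2x2 square and the same square shifted one pixel to the right.
pred = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 0, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 0]]
print(dice(pred, target), hd95(pred, target))
```

The brute-force nearest-point search is quadratic in the number of boundary pixels, which is fine for a toy mask; a distance-transform-based implementation would be the usual choice for full-size images. Higher DSC and lower HD95 both indicate better agreement, which is how the figures reported in the abstract should be read.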

arXiv information

Authors	Thien B. Nguyen-Tat, Hoang-An Vo, Phuoc-Sang Dang
Published	2025-02-14 16:56:24+00:00

Source and services used

arxiv.jp, Google
