One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts

Summary

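This work focuses on building SAT, a model driven by text prompts that can segment anything in medical scenarios.
Our main contributions are threefold. (i) On data construction, we combine multiple knowledge sources to build a multi-modal medical knowledge tree.
We then build a large-scale segmentation dataset for training by collecting over 11,000 3D medical image scans from 31 segmentation datasets, carefully standardizing both the visual scans and the label space.
(ii) On model training, we formulate a universal segmentation model that is prompted with medical terminology in text form.
We present a knowledge-enhanced representation learning framework, along with a series of strategies for training effectively on a combination of many datasets.
(iii) On model evaluation, we train SAT-Nano, with only 107M parameters, to segment 31 different segmentation datasets from text prompts, covering 362 categories.
We thoroughly evaluate the model from three perspectives (averaged by body region, by class, and by dataset) and demonstrate performance comparable to 36 specialist nnUNets: an nnUNet model is trained on each dataset or subset, resulting in 36 nnUNets with around 1000M parameters for the 31 datasets.
We will release all code and models used in this report, i.e., SAT-Nano.
In addition, we plan to offer SAT-Ultra in the near future, a larger model trained on more diverse datasets.
Webpage URL: https://zhaoziheng.github.io/MedUniSeg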
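
The three evaluation views named in the summary (by body region, by class, by dataset) can be illustrated with a small sketch. This is not the authors' evaluation code: it assumes the Dice coefficient as the per-class score, and the helper names (`dice`, `mean_by`), the record layout, and every value in `records` are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def dice(pred, target):
    """Dice coefficient between two flattened binary masks (sequences of 0/1)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_by(records, key):
    """Group per-class scores by `key` and return {group name: mean score}."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["dice"])
    return {name: mean(scores) for name, scores in groups.items()}


# Hypothetical per-class results: names and scores are placeholders,
# not numbers reported in the paper.
records = [
    {"dataset": "dataset_A", "cls": "liver", "region": "abdomen", "dice": 0.9},
    {"dataset": "dataset_A", "cls": "kidney", "region": "abdomen", "dice": 0.8},
    {"dataset": "dataset_B", "cls": "liver", "region": "abdomen", "dice": 0.7},
    {"dataset": "dataset_B", "cls": "brainstem", "region": "brain", "dice": 0.6},
]

if __name__ == "__main__":
    print("by region: ", mean_by(records, "region"))
    print("by class:  ", mean_by(records, "cls"))
    print("by dataset:", mean_by(records, "dataset"))
```

The same per-class scores produce different summaries depending on the grouping key, which is presumably why the summary reports all three: a dataset with many easy classes can dominate one view and barely register in another.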

Abstract (original)

In this study, we focus on building up a model that can Segment Anything in medical scenarios, driven by Text prompts, termed as SAT. Our main contributions are three folds: (i) on data construction, we combine multiple knowledge sources to construct a multi-modal medical knowledge tree; Then we build up a large-scale segmentation dataset for training, by collecting over 11K 3D medical image scans from 31 segmentation datasets with careful standardization on both visual scans and label space; (ii) on model training, we formulate a universal segmentation model, that can be prompted by inputting medical terminologies in text form. We present a knowledge-enhanced representation learning framework, and a series of strategies for effectively training on the combination of a large number of datasets; (iii) on model evaluation, we train a SAT-Nano with only 107M parameters, to segment 31 different segmentation datasets with text prompt, resulting in 362 categories. We thoroughly evaluate the model from three aspects: averaged by body regions, averaged by classes, and average by datasets, demonstrating comparable performance to 36 specialist nnUNets, i.e., we train nnUNet models on each dataset/subset, resulting in 36 nnUNets with around 1000M parameters for the 31 datasets. We will release all the codes, and models used in this report, i.e., SAT-Nano. Moreover, we will offer SAT-Ultra in the near future, which is trained with model of larger size, on more diverse datasets. Webpage URL: https://zhaoziheng.github.io/MedUniSeg.

arXiv information

Authors	Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
Published	2023-12-28 18:16:00+00:00

Source and services used

arxiv.jp, Google
