Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer

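Summary

The Segment Anything Model (SAM) is attracting growing attention in medical image analysis because of its zero-shot generalization: given appropriate user prompts, it can segment objects from unseen classes and domains.
Addressing SAM's performance gap is important for making full use of its pre-trained weights, particularly in volumetric medical image segmentation, where accuracy matters but well-annotated 3D medical data for fine-tuning is scarce.
This work investigates whether introducing a memory mechanism as a plug-in, specifically the ability to memorize and recall internal representations of past inputs, can improve SAM's performance at limited computational cost.
To this end, the authors propose Memorizing SAM, a novel 3D SAM architecture that incorporates a memory Transformer as a plug-in.
Unlike conventional memorizing Transformers, which save internal representations during training or inference, Memorizing SAM uses existing, highly accurate internal representations as its memory source to ensure memory quality.
Evaluated on 33 categories from the TotalSegmentator dataset, Memorizing SAM outperforms FastSAM3D, a state-of-the-art 3D SAM variant, with an average Dice increase of 11.36% at the cost of only a 4.38 ms increase in inference time.
Source code: https://github.com/swedfr/memorizingSAM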
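
The "memorize and recall" idea can be illustrated with a toy lookup. The following is a minimal sketch only, assuming a memory of stored key/value vectors, a top-k nearest-neighbour lookup by dot product, and a fixed scalar gate for blending; none of this is taken from the Memorizing SAM code, and the names `recall_from_memory`, `blend`, and `gate` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recall_from_memory(query, memory_keys, memory_values, top_k=2):
    """Attention-weighted mix of the top_k stored values whose keys best match the query."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, key) * scale for key in memory_keys]
    # Keep only the top_k best-matching memory slots (hypothetical retrieval rule).
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in top])
    dim = len(memory_values[0])
    return [sum(w * memory_values[i][d] for w, i in zip(weights, top)) for d in range(dim)]


def blend(local_out, memory_out, gate):
    """Mix the model's own output with the recalled one; gate=0 ignores memory."""
    return [(1.0 - gate) * l + gate * m for l, m in zip(local_out, memory_out)]


# Memory filled ahead of time with (assumed) high-quality representations.
memory_keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
memory_values = [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]]

recalled = recall_from_memory([2.0, 0.0], memory_keys, memory_values, top_k=1)
mixed = blend([0.0, 0.0], recalled, gate=0.5)
print(recalled, mixed)  # [10.0, 0.0] [5.0, 0.0]
```

In this toy setting the memory is filled once and only read afterwards, which mirrors the summary's point that the memory comes from existing representations rather than being written during training or inference.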

Abstract (original)

Segment Anything Models (SAMs) have gained increasing attention in medical image analysis due to their zero-shot generalization capability in segmenting objects of unseen classes and domains when provided with appropriate user prompts. Addressing this performance gap is important to fully leverage the pre-trained weights of SAMs, particularly in the domain of volumetric medical image segmentation, where accuracy is important but well-annotated 3D medical data for fine-tuning is limited. In this work, we investigate whether introducing the memory mechanism as a plug-in, specifically the ability to memorize and recall internal representations of past inputs, can improve the performance of SAM with limited computation cost. To this end, we propose Memorizing SAM, a novel 3D SAM architecture incorporating a memory Transformer as a plug-in. Unlike conventional memorizing Transformers that save the internal representation during training or inference, our Memorizing SAM utilizes existing highly accurate internal representation as the memory source to ensure the quality of memory. We evaluate the performance of Memorizing SAM in 33 categories from the TotalSegmentator dataset, which indicates that Memorizing SAM can outperform state-of-the-art 3D SAM variant i.e., FastSAM3D with an average Dice increase of 11.36% at the cost of only 4.38 millisecond increase in inference time. The source code is publicly available at https://github.com/swedfr/memorizingSAM
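
For readers unfamiliar with the metric behind the reported improvement, here is a small worked example of the Dice coefficient, 2|A∩B| / (|A| + |B|), on flattened binary masks. It is an illustrative sketch, not the paper's evaluation code; the masks are made up, and the abstract does not say whether the 11.36% figure is an absolute or a relative change.

```python
def dice(pred, target):
    """Dice coefficient between two equal-length binary masks (flattened voxels)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention used here (an assumption): two empty masks count as a perfect match.
    return 1.0 if size == 0 else 2.0 * intersection / size


# Toy masks: 3 predicted voxels, 3 ground-truth voxels, 2 of them shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]

score = dice(pred, target)
print(round(score, 4))  # 0.6667, i.e. 2 * 2 / (3 + 3)
```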

arXiv information

Authors	Xinyuan Shao, Yiqing Shen, Mathias Unberath
Published	2024-12-18 14:51:25+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
