SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection

Summary

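This paper presents a new approach to effective image segmentation that can be integrated into any model or methodology.
The paradigm we choose is the classification of medical images (3D chest CT scans) for Covid-19 detection.
Our approach combines vision-language models that segment the CT scans, which are then fed to a deep neural architecture named RACNet for Covid-19 detection.
In particular, we introduce a novel segmentation framework, SAM2CLIP2SAM, which leverages the strengths of both the Segment Anything Model (SAM) and Contrastive Language-Image Pre-Training (CLIP) to accurately segment the right and left lungs in CT scans.
These segmented outputs are then fed into RACNet to classify Covid-19 and non-Covid-19 cases.
First, SAM produces multiple part-based segmentation masks for each slice of the CT scan.
Then CLIP selects only the masks associated with the regions of interest (ROIs), i.e., the right and left lungs.
Finally, SAM is given these ROIs as prompts and generates the final segmentation mask for the lungs.
Experiments on two Covid-19 annotated databases illustrate the improved performance obtained when our method is used to segment the CT scans.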
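
The three stages described above (SAM proposes masks, CLIP keeps the lung masks, SAM is prompted again) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `propose_masks`, `embed_region`, and `segment_with_box` are hypothetical callables standing in for SAM and CLIP, and both the cosine-similarity threshold and the use of bounding boxes as prompts are assumptions, since the abstract does not say how masks are scored or how the ROIs are encoded as prompts.

```python
import math
from typing import Any, Callable, List, Optional, Sequence, Tuple

Mask = List[List[int]]            # binary mask as rows of 0/1
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tight bounding box around nonzero pixels, or None for an empty mask."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def sam2clip2sam_slice(
    image: Any,
    propose_masks: Callable[[Any], List[Mask]],          # hypothetical SAM stand-in
    embed_region: Callable[[Any, Mask], Sequence[float]],  # hypothetical CLIP image encoder
    text_embeddings: Sequence[Sequence[float]],          # e.g. embeddings of lung text prompts
    segment_with_box: Callable[[Any, Box], Any],         # hypothetical prompted SAM stand-in
    threshold: float = 0.5,                              # assumed cut-off, not from the paper
) -> List[Any]:
    # Stage 1: obtain candidate part-based masks for this CT slice.
    candidates = propose_masks(image)

    # Stage 2: keep masks whose region embedding matches any ROI text embedding.
    boxes: List[Box] = []
    for mask in candidates:
        region = embed_region(image, mask)
        score = max((cosine(region, t) for t in text_embeddings), default=0.0)
        box = mask_to_box(mask)
        if score >= threshold and box is not None:
            boxes.append(box)

    # Stage 3: re-prompt the segmenter with each selected ROI (here, as a box).
    return [segment_with_box(image, box) for box in boxes]
```

Passing the models in as callables keeps the sketch independent of any particular SAM or CLIP release; in practice each slice of the 3D scan would be processed this way before the segmented volume is handed to the classifier.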

Summary (original)

This paper presents a new approach for effective segmentation of images that can be integrated into any model and methodology; the paradigm that we choose is classification of medical images (3-D chest CT scans) for Covid-19 detection. Our approach includes a combination of vision-language models that segment the CT scans, which are then fed to a deep neural architecture, named RACNet, for Covid-19 detection. In particular, a novel framework, named SAM2CLIP2SAM, is introduced for segmentation that leverages the strengths of both Segment Anything Model (SAM) and Contrastive Language-Image Pre-Training (CLIP) to accurately segment the right and left lungs in CT scans, subsequently feeding these segmented outputs into RACNet for classification of COVID-19 and non-COVID-19 cases. At first, SAM produces multiple part-based segmentation masks for each slice in the CT scan; then CLIP selects only the masks that are associated with the regions of interest (ROIs), i.e., the right and left lungs; finally SAM is given these ROIs as prompts and generates the final segmentation mask for the lungs. Experiments are presented across two Covid-19 annotated databases which illustrate the improved performance obtained when our method has been used for segmentation of the CT scans.

arXiv information

Authors	Dimitrios Kollias, Anastasios Arsenos, James Wingate, Stefanos Kollias
Published	2024-07-22 15:31:18+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
