Adapting the Segment Anything Model During Usage in Novel Situations

Summary

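Interactive segmentation is the task of creating object segmentation masks from user interactions.
The most common way to guide a model toward a correct segmentation is to click on the object and on the background.
The recently published Segment Anything Model (SAM) supports a generalized version of the interactive segmentation problem and was trained on an object segmentation dataset containing 1.1B masks.
Although SAM was trained extensively and with the explicit purpose of serving as a foundation model, we show that it has significant limitations when applied to interactive segmentation on novel domains or object types.
On the datasets used, SAM exhibits a failure rate $\text{FR}_{30}@90$ of up to $72.6 \%$.
Since such foundation models should still be immediately applicable, we present a framework that adapts SAM during immediate usage.
To this end, we leverage the user interactions and masks that are constructed during the interactive segmentation process.
We use this information to generate pseudo-labels, from which we compute a loss function and optimize a part of the SAM model.
The presented method yields a relative reduction of up to $48.1 \%$ in the $\text{FR}_{20}@85$ metric and $46.6 \%$ in the $\text{FR}_{30}@90$ metric.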
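
The abstract reports results as failure rates without defining them. In the interactive segmentation literature, $\text{FR}_{N}@T$ is commonly read as the percentage of instances that do not reach an IoU of $T \%$ within $N$ clicks. The sketch below assumes that reading. The function names and the toy IoU values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def failure_rate(
    iou_per_click: Sequence[Sequence[float]],
    max_clicks: int,
    iou_threshold: float,
) -> float:
    """Percentage of instances that never reach iou_threshold within max_clicks.

    iou_per_click[i][k] is the IoU of instance i after click k + 1.
    This follows the common reading of FR_N@T; the paper may differ in details.
    """
    if not iou_per_click:
        raise ValueError("need at least one instance")
    failures = 0
    for ious in iou_per_click:
        # Only the first max_clicks interactions count toward success.
        reached = any(iou >= iou_threshold for iou in ious[:max_clicks])
        if not reached:
            failures += 1
    return 100.0 * failures / len(iou_per_click)


def relative_reduction(baseline: float, adapted: float) -> float:
    """Relative reduction, in percent, of a failure rate against a baseline."""
    if baseline <= 0:
        raise ValueError("baseline failure rate must be positive")
    return 100.0 * (baseline - adapted) / baseline


# Toy IoU trajectories for four instances (illustrative values only).
toy_ious = [
    [0.50, 0.92],        # reaches 0.90 on the second click
    [0.30, 0.40, 0.60],  # never reaches 0.90
    [0.95],              # reaches 0.90 on the first click
    [0.10, 0.20],        # never reaches 0.90
]

fr_30_at_90 = failure_rate(toy_ious, max_clicks=30, iou_threshold=0.90)
print(f"FR_30@90 = {fr_30_at_90:.1f} %")
```

Under this reading, a relative reduction compares the adapted model's failure rate with the unadapted baseline on the same dataset, so `relative_reduction(50.0, 25.0)` corresponds to halving the failure rate.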

Summary (original)

The interactive segmentation task consists in the creation of object segmentation masks based on user interactions. The most common way to guide a model towards producing a correct segmentation consists in clicks on the object and background. The recently published Segment Anything Model (SAM) supports a generalized version of the interactive segmentation problem and has been trained on an object segmentation dataset which contains 1.1B masks. Though being trained extensively and with the explicit purpose of serving as a foundation model, we show significant limitations of SAM when being applied for interactive segmentation on novel domains or object types. On the used datasets, SAM displays a failure rate $\text{FR}_{30}@90$ of up to $72.6 \%$. Since we still want such foundation models to be immediately applicable, we present a framework that can adapt SAM during immediate usage. For this we will leverage the user interactions and masks, which are constructed during the interactive segmentation process. We use this information to generate pseudo-labels, which we use to compute a loss function and optimize a part of the SAM model. The presented method causes a relative reduction of up to $48.1 \%$ in the $\text{FR}_{20}@85$ and $46.6 \%$ in the $\text{FR}_{30}@90$ metrics.
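
The abstract states only that pseudo-labels are generated from the user's clicks and the resulting masks, and that a loss is computed from them. It does not say how. As a purely hypothetical illustration of the idea, and not the authors' method, the sketch below takes the mask from the end of an interaction, overrides the clicked pixels with the click labels, and scores predicted probabilities against that pseudo-label with binary cross-entropy. All names here (`Click`, `make_pseudo_label`, `bce_loss`) are assumptions introduced for this example.

```python
import math
from typing import List, Tuple

# (row, col, is_foreground): a hypothetical click representation.
Click = Tuple[int, int, bool]


def make_pseudo_label(final_mask: List[List[int]], clicks: List[Click]) -> List[List[int]]:
    """Build a pseudo-label from the final mask, trusting clicks over the mask.

    Assumption: clicked pixels carry the user's intent, so they override
    whatever the mask says at that location.
    """
    label = [row[:] for row in final_mask]  # copy, leave the input untouched
    for r, c, is_foreground in clicks:
        label[r][c] = 1 if is_foreground else 0
    return label


def bce_loss(probs: List[List[float]], label: List[List[int]], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and a label."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, label):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


# Toy 2x2 example (illustrative values only).
final_mask = [[1, 0], [0, 0]]
clicks = [(1, 1, True), (0, 0, False)]
pseudo_label = make_pseudo_label(final_mask, clicks)
loss = bce_loss([[0.5, 0.5], [0.5, 0.5]], pseudo_label)
```

In an actual adaptation loop, a loss of this kind would be backpropagated through the part of SAM that is being optimized. Which part that is, and which loss is used, is not specified in the abstract.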

arXiv information

Authors	Robin Schön, Julian Lorenz, Katja Ludwig, Rainer Lienhart
Published	2024-04-12 12:10:53+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
