From SAM to SAM 2: Exploring Improvements in Meta’s Segment Anything Model

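Summary

The Segment Anything Model (SAM), which Meta introduced to the computer vision community in April 2023, is a groundbreaking tool that automatically segments objects in images based on prompts such as text, clicks, or bounding boxes.
SAM excels at zero-shot performance: drawing on a large dataset of over one billion image masks, it segments previously unseen objects without additional training.
SAM 2 extends this capability to video, using memory from preceding and subsequent frames to produce accurate segmentation across an entire video at near real-time speed.
This comparison shows how SAM has evolved to meet the growing need for accurate and efficient segmentation across a range of applications.
The study suggests that future advances in models like SAM will be essential to improving computer vision technology.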

Summary (original)

The Segment Anything Model (SAM), introduced to the computer vision community by Meta in April 2023, is a groundbreaking tool that allows automated segmentation of objects in images based on prompts such as text, clicks, or bounding boxes. SAM excels in zero-shot performance, segmenting unseen objects without additional training, stimulated by a large dataset of over one billion image masks. SAM 2 expands this functionality to video, leveraging memory from preceding and subsequent frames to generate accurate segmentation across entire videos, enabling near real-time performance. This comparison shows how SAM has evolved to meet the growing need for precise and efficient segmentation in various applications. The study suggests that future advancements in models like SAM will be crucial for improving computer vision technology.

arXiv information

Authors	Athulya Sundaresan Geetha, Muhammad Hussain
Published	2024-08-12 17:17:35+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
