Human-in-the-Loop Segmentation of Multi-species Coral Imagery

Summary

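Marine surveys by robotic underwater and surface vehicles produce large volumes of coral reef imagery, but labeling these images is costly and time-consuming for domain experts.
Point label propagation is a technique that takes existing images labeled with sparse points and creates augmented ground truth data, which can then be used to train a semantic segmentation model.
In this work, we show that recent advances in large foundation models make it easier to create augmented ground truth masks using only features extracted by a denoised version of the DINOv2 foundation model and K-Nearest Neighbors (KNN), without any pre-training.
For images with extremely sparse labels, we propose a labeling method based on human-in-the-loop principles, which greatly improves annotation efficiency. With 5 point labels per image, the human-in-the-loop method outperforms the prior state of the art by 14.2% in pixel accuracy and 19.7% in mIoU.
With 10 point labels, the gains are 8.9% and 18.3%, respectively.
Even when human-in-the-loop labeling is not available, using the denoised DINOv2 features with KNN improves on the prior state of the art by 2.7% in pixel accuracy and 5.8% in mIoU (5 grid points).
On the semantic segmentation task, we outperform the prior state of the art by 8.8% in pixel accuracy and 13.5% in mIoU when only 5 point labels are used for point label propagation.
In addition, we carry out a comprehensive study of how the point label placement style and the number of points affect point label propagation quality, and we make several recommendations for labeling images with points more efficiently.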
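
The idea of propagating sparse point labels with KNN over per-pixel features can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `propagate_point_labels`, the use of cosine similarity, and majority voting are assumptions made here for illustration, and a tiny synthetic array stands in for the denoised DINOv2 features that the paper actually uses.

```python
import numpy as np

def propagate_point_labels(features, point_rc, point_labels, k=1):
    """Assign every pixel the majority label of its k most similar labeled points.

    features:     (H, W, D) per-pixel feature map (hypothetical stand-in for
                  denoised DINOv2 features upsampled to image resolution)
    point_rc:     (N, 2) integer row/column coordinates of the labeled points
    point_labels: (N,) integer class ids of the labeled points
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d).astype(np.float64)
    # L2-normalise so that a dot product equals cosine similarity (an assumption).
    flat /= np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12
    anchors = flat[point_rc[:, 0] * w + point_rc[:, 1]]        # (N, D)
    similarity = flat @ anchors.T                               # (H*W, N)
    k = min(k, len(point_labels))
    nearest = np.argsort(-similarity, axis=1)[:, :k]            # (H*W, k)
    votes = point_labels[nearest]                               # (H*W, k)
    # Count votes per class for each pixel, then take the majority class.
    n_classes = int(point_labels.max()) + 1
    counts = np.zeros((h * w, n_classes), dtype=np.int64)
    np.add.at(counts, (np.arange(h * w)[:, None], votes), 1)
    return counts.argmax(axis=1).reshape(h, w)

# Synthetic 4x6 "image": the left half has feature (1, 0), the right half (0, 1).
feats = np.zeros((4, 6, 2))
feats[:, :3, 0] = 1.0
feats[:, 3:, 1] = 1.0
# One labeled point per region: class 0 at (0, 0), class 1 at (3, 5).
mask = propagate_point_labels(feats, np.array([[0, 0], [3, 5]]), np.array([0, 1]), k=1)
```

In this toy case two labeled points are enough to recover a dense mask, because the features separate the two regions cleanly. Real coral imagery is far less clean, which is where the quality of the feature extractor matters.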

Abstract (original)

Marine surveys by robotic underwater and surface vehicles result in substantial quantities of coral reef imagery, however labeling these images is expensive and time-consuming for domain experts. Point label propagation is a technique that uses existing images labeled with sparse points to create augmented ground truth data, which can be used to train a semantic segmentation model. In this work, we show that recent advances in large foundation models facilitate the creation of augmented ground truth masks using only features extracted by the denoised version of the DINOv2 foundation model and K-Nearest Neighbors (KNN), without any pre-training. For images with extremely sparse labels, we present a labeling method based on human-in-the-loop principles, which greatly enhances annotation efficiency: in the case that there are 5 point labels per image, our human-in-the-loop method outperforms the prior state-of-the-art by 14.2% for pixel accuracy and 19.7% for mIoU; and by 8.9% and 18.3% if there are 10 point labels. When human-in-the-loop labeling is not available, using the denoised DINOv2 features with a KNN still improves on the prior state-of-the-art by 2.7% for pixel accuracy and 5.8% for mIoU (5 grid points). On the semantic segmentation task, we outperform the prior state-of-the-art by 8.8% for pixel accuracy and by 13.5% for mIoU when only 5 point labels are used for point label propagation. Additionally, we perform a comprehensive study into the impacts of the point label placement style and the number of points on the point label propagation quality, and make several recommendations for improving the efficiency of labeling images with points.
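
The two metrics quoted in the abstract, pixel accuracy and mIoU, can be computed from a predicted mask and a ground truth mask as in the sketch below. It uses the standard definitions; the abstract does not state the paper's exact evaluation protocol (for example, how absent or ignored classes are handled), so skipping classes that appear in neither mask is an assumption, and the function names are illustrative.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float((pred == target).mean())

def mean_iou(pred, target, n_classes):
    """Mean intersection-over-union across classes.

    Classes that appear in neither mask are skipped (an assumption; the
    paper's protocol may differ).
    """
    ious = []
    for c in range(n_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy 2x2 example: one pixel of class 0 is mispredicted as class 1.
target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
acc = pixel_accuracy(pred, target)        # 3 of 4 pixels correct
miou = mean_iou(pred, target, n_classes=2)  # mean of IoU 1/2 and IoU 2/3
```

Because mIoU averages over classes rather than pixels, a single error in a small class lowers it more than it lowers pixel accuracy, as the toy example shows.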

arXiv information

Authors	Scarlett Raine, Ross Marchant, Brano Kusy, Frederic Maire, Niko Suenderhauf, Tobias Fischer
Published	2024-11-12 04:37:47+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
