HAISTA-NET: Human Assisted Instance Segmentation Through Attention

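Summary

Instance segmentation is a form of image detection with applications such as object refinement, medical image analysis, and image/video editing, all of which demand a high degree of accuracy.
Even state-of-the-art, fully automated instance segmentation algorithms often cannot deliver that accuracy.
The performance gap is especially prohibitive for small and complex objects.
Practitioners typically fall back on fully manual annotation, which can be a laborious process.
To overcome this problem, the authors propose a new approach that enables more precise predictions and produces higher-quality segmentation masks for high-curvature, complex, and small-scale objects.
Their human-assisted segmentation model, HAISTA-NET, extends the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries.
They also present a dataset of hand-drawn partial object boundaries, which they call human attention maps.
In addition, the Partial Sketch Object Boundaries (PSOB) dataset contains hand-drawn partial object boundaries that represent the curvature of an object's ground-truth mask with several pixels.
Through extensive evaluation on the PSOB dataset, they show that HAISTA-NET outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, and Mask2Former, with respective gains of +36.7, +29.6, and +26.5 points in AP-Mask over these three models.
The authors hope that, by combining fully automated and interactive instance segmentation architectures, their approach will set a baseline for future human-aided deep learning models.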

Summary (original)

Instance segmentation is a form of image detection which has a range of applications, such as object refinement, medical image analysis, and image/video editing, all of which demand a high degree of accuracy. However, this precision is often beyond the reach of what even state-of-the-art, fully automated instance segmentation algorithms can deliver. The performance gap becomes particularly prohibitive for small and complex objects. Practitioners typically resort to fully manual annotation, which can be a laborious process. In order to overcome this problem, we propose a novel approach to enable more precise predictions and generate higher-quality segmentation masks for high-curvature, complex and small-scale objects. Our human-assisted segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries. We also present a dataset of hand-drawn partial object boundaries, which we refer to as human attention maps. In addition, the Partial Sketch Object Boundaries (PSOB) dataset contains hand-drawn partial object boundaries which represent curvatures of an object’s ground truth mask with several pixels. Through extensive evaluation using the PSOB dataset, we show that HAISTA-NET outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, and Mask2Former, achieving respective increases of +36.7, +29.6, and +26.5 points in AP-Mask metrics for these three models. We hope that our novel approach will set a baseline for future human-aided deep learning models by combining fully automated and interactive instance segmentation architectures.
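
The abstract reports its gains in AP-Mask but does not spell out the evaluation protocol. As a rough illustration only, the sketch below assumes a COCO-style setup, in which a predicted mask counts as a match when its intersection over union (IoU) with the ground-truth mask clears each threshold from 0.50 to 0.95. The function names `mask_iou` and `matches_at_thresholds` are hypothetical and are not taken from the paper or from any library.

```python
def mask_iou(pred, truth):
    """IoU of two equally sized binary masks given as lists of rows of 0/1."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks have no overlap to measure; return 0.0 by convention here.
    return intersection / union if union else 0.0


def matches_at_thresholds(iou, thresholds=None):
    """Report, per IoU threshold, whether a prediction would count as a match.

    Assumption: COCO-style thresholds 0.50, 0.55, ..., 0.95.
    """
    if thresholds is None:
        thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
    return [iou >= t for t in thresholds]


# Toy example: two 2x4 masks, each covering six pixels and shifted by one column.
pred = [[1, 1, 1, 0],
        [1, 1, 1, 0]]
truth = [[0, 1, 1, 1],
         [0, 1, 1, 1]]

iou = mask_iou(pred, truth)
hits = matches_at_thresholds(iou)
print(iou, hits)
```

A full AP-Mask computation also ranks predictions by confidence and integrates precision over recall. This sketch covers only the mask-overlap step, which is where boundary quality on small, high-curvature objects shows up.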

arXiv information

Authors	Muhammed Korkmaz, T. Metin Sezgin
Published	2024-03-08 13:30:58+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
