An End-to-End Network for Co-Saliency Detection in One Single Image

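Abstract

Co-saliency detection within a single image is a common vision problem that has received little attention and has not yet been well addressed.
Existing methods often use a bottom-up strategy to infer co-saliency in an image: salient regions are first detected using visual primitives such as color and shape, and are then grouped and merged into a co-saliency map.
However, human vision perceives co-saliency through a complex combination of bottom-up and top-down strategies.
To address this problem, this study proposes a novel end-to-end trainable network comprising a backbone net and two branch nets.
The backbone net uses ground-truth masks as top-down guidance for saliency prediction, whereas the two branch nets construct triplet proposals for regional feature mapping and clustering, which drives the network to be bottom-up sensitive to co-salient regions.
To evaluate the proposed method, we construct a new dataset of 2,019 natural images, each of which exhibits co-saliency.
Experimental results show that the proposed method achieves state-of-the-art accuracy at a running speed of 28 fps.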

Abstract (original)

Co-saliency detection within a single image is a common vision problem that has received little attention and has not yet been well addressed. Existing methods often used a bottom-up strategy to infer co-saliency in an image in which salient regions are firstly detected using visual primitives such as color and shape and then grouped and merged into a co-saliency map. However, co-saliency is intrinsically perceived complexly with bottom-up and top-down strategies combined in human vision. To address this problem, this study proposes a novel end-to-end trainable network comprising a backbone net and two branch nets. The backbone net uses ground-truth masks as top-down guidance for saliency prediction, whereas the two branch nets construct triplet proposals for regional feature mapping and clustering, which drives the network to be bottom-up sensitive to co-salient regions. We construct a new dataset of 2,019 natural images with co-saliency in each image to evaluate the proposed method. Experimental results show that the proposed method achieves state-of-the-art accuracy with a running speed of 28 fps.
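
The abstract mentions triplet proposals for regional feature mapping and clustering but does not state the loss that is used. As a rough illustration only, the sketch below assumes a standard triplet margin loss over region embeddings. The function names, the margin value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption: this is a generic formulation, not necessarily the paper's.
    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings of three region proposals.
anchor = [0.0, 0.0]    # a co-salient region
positive = [0.0, 1.0]  # another region of the same co-salient group
negative = [3.0, 4.0]  # a region that is not co-salient with the anchor

# Well-separated triplet: the hinge clips the loss to zero.
loss_easy = triplet_margin_loss(anchor, positive, negative)

# Swapping positive and negative violates the margin, so the loss is positive.
loss_hard = triplet_margin_loss(anchor, negative, positive)
```

Minimizing a loss of this kind pulls embeddings of regions in the same group together and pushes other regions away, which is one common way to make features suitable for clustering.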

arXiv information

Authors	Yuanhao Yue, Qin Zou, Hongkai Yu, Qian Wang, Zhongyuan Wang, Song Wang
Published	2023-02-15 15:17:28+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
