Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Summary

[Title] Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

[Summary]

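– The paper studies learning object segmentation from unlabeled videos.
– Humans can easily segment moving objects without knowing what they are.
– The Gestalt law of common fate, i.e., that things moving at the same speed belong together, has inspired unsupervised object discovery based on motion segmentation. However, common fate is not a reliable indicator of objectness: parts of an articulated or deformable object may not move at the same speed, whereas an object's shadows and reflections always move with it but are not part of it.
– The key idea is to bootstrap objectness by first learning image features from relaxed common fate and then refining them based on visual appearance grouping, both within the image itself and statistically across images.
– Specifically, an image segmenter is first learned in the loop of approximating optical flow with constant segment flow plus small within-segment residual flow, and is then refined for more coherent appearance and statistical figure-ground relevance.
– On unsupervised video object segmentation, using only a ResNet and convolutional heads, the model surpasses the state of the art by absolute gains of 7/9/5% on DAVIS16 / STv2 / FBMS59, respectively. The code is publicly available.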
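
The relaxed common fate step described in the bullets above can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function names, the soft-mask input format, the externally supplied residual, and the clipping threshold `max_residual` are all assumptions made for illustration. The sketch approximates a flow field by one constant flow vector per segment plus a bounded per-pixel residual, and measures how well that approximation reconstructs the input flow.

```python
import numpy as np


def piecewise_constant_flow(flow, masks, eps=1e-8):
    """Approximate `flow` with one flow vector per segment.

    flow:  (H, W, 2) optical flow field.
    masks: (K, H, W) soft segment assignments (assumed to sum to 1 over K).
    Returns the (H, W, 2) piecewise-constant flow and the (K, 2) segment flows.
    """
    weights = masks.reshape(masks.shape[0], -1)   # (K, H*W)
    flat = flow.reshape(-1, 2)                    # (H*W, 2)
    # Mask-weighted mean flow of each segment.
    seg_flow = weights @ flat / (weights.sum(axis=1, keepdims=True) + eps)
    # Each pixel receives the flow of the segment(s) it belongs to.
    const = np.einsum("khw,kc->hwc", masks, seg_flow)
    return const, seg_flow


def relaxed_common_fate_loss(flow, masks, residual, max_residual=1.0):
    """L1 error between `flow` and constant segment flow plus a small residual.

    `residual` is an (H, W, 2) array; clipping it to [-max_residual, max_residual]
    is an assumed way of keeping the within-segment residual small.
    """
    const, _ = piecewise_constant_flow(flow, masks)
    small = np.clip(residual, -max_residual, max_residual)
    return float(np.mean(np.abs(const + small - flow)))
```

In this sketch, a segmentation that matches the moving regions yields a low reconstruction error, while a segmentation that merges regions with different motion yields a higher one. How the masks and the residual are predicted and trained is outside the scope of this example.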

Summary (original)

We study learning object segmentation from unlabeled videos. Humans can easily segment moving objects without knowing what they are. The Gestalt law of common fate, i.e., what move at the same speed belong together, has inspired unsupervised object discovery based on motion segmentation. However, common fate is not a reliable indicator of objectness: Parts of an articulated / deformable object may not move at the same speed, whereas shadows / reflections of an object always move with it but are not part of it. Our insight is to bootstrap objectness by first learning image features from relaxed common fate and then refining them based on visual appearance grouping within the image itself and across images statistically. Specifically, we learn an image segmenter first in the loop of approximating optical flow with constant segment flow plus small within-segment residual flow, and then by refining it for more coherent appearance and statistical figure-ground relevance. On unsupervised video object segmentation, using only ResNet and convolutional heads, our model surpasses the state-of-the-art by absolute gains of 7/9/5% on DAVIS16 / STv2 / FBMS59 respectively, demonstrating the effectiveness of our ideas. Our code is publicly available.

arXiv information

Authors	Long Lian, Zhirong Wu, Stella X. Yu
Published	2023-04-17 07:18:21+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, OpenAI
