Unsupervised Conditional Slot Attention for Object Centric Learning

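Summary

Extracting object-level representations for downstream reasoning tasks is an emerging area of AI.
Learning object-centric representations in an unsupervised setting poses multiple challenges, a key one being the binding of an arbitrary number of object instances to specialized object slots.
Recent object-centric representation methods such as Slot Attention use iterative attention to learn composable representations with dynamic inference-level binding, but they fail to achieve specialized slot-level binding.
To address this, this paper proposes Unsupervised Conditional Slot Attention using a novel Probabilistic Slot Dictionary (PSD).
The PSD is defined with (i) abstract object-level property vectors as keys and (ii) parametric Gaussian distributions as the corresponding values.
We demonstrate the benefits of the learned object-level conditioning distributions in multiple downstream tasks, namely object discovery, compositional scene generation, and compositional visual reasoning.
We show that our method provides scene composition capabilities and a significant boost in few-shot adaptability tasks of compositional visual reasoning, while performing on par with or better than Slot Attention in object discovery tasks.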

Summary (original)

Extracting object-level representations for downstream reasoning tasks is an emerging area in AI. Learning object-centric representations in an unsupervised setting presents multiple challenges, a key one being binding an arbitrary number of object instances to a specialized object slot. Recent object-centric representation methods like Slot Attention utilize iterative attention to learn composable representations with dynamic inference level binding but fail to achieve specialized slot level binding. To address this, in this paper we propose Unsupervised Conditional Slot Attention using a novel Probabilistic Slot Dictionary (PSD). We define PSD with (i) abstract object-level property vectors as key and (ii) parametric Gaussian distribution as its corresponding value. We demonstrate the benefits of the learnt specific object-level conditioning distributions in multiple downstream tasks, namely object discovery, compositional scene generation, and compositional visual reasoning. We show that our method provides scene composition capabilities and a significant boost in a few shot adaptability tasks of compositional visual reasoning, while performing similarly or better than slot attention in object discovery tasks.
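
The key/value structure of the PSD described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that keys are abstract object-level property vectors and values are parametric Gaussian distributions. The class name `ProbabilisticSlotDictionary`, the cosine-similarity lookup, the diagonal covariance, and the random initialization (standing in for learned parameters) are all assumptions made for illustration.

```python
import math
import random


class ProbabilisticSlotDictionary:
    """Hypothetical sketch: property-vector keys mapped to diagonal Gaussians."""

    def __init__(self, num_entries, key_dim, slot_dim, seed=0):
        self.rng = random.Random(seed)
        # Keys: one abstract property vector per entry.
        # Random here (assumption); they would be learned in practice.
        self.keys = [[self.rng.gauss(0, 1) for _ in range(key_dim)]
                     for _ in range(num_entries)]
        # Values: mean and log standard deviation of a diagonal Gaussian.
        self.means = [[self.rng.gauss(0, 1) for _ in range(slot_dim)]
                      for _ in range(num_entries)]
        self.log_stds = [[0.0] * slot_dim for _ in range(num_entries)]

    def lookup(self, query):
        """Index of the key most similar to the query (cosine similarity, assumed)."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm > 0 else 0.0

        return max(range(len(self.keys)), key=lambda i: cosine(query, self.keys[i]))

    def sample_slot(self, query):
        """Sample a slot vector from the Gaussian stored under the matched key."""
        i = self.lookup(query)
        return [m + math.exp(s) * self.rng.gauss(0, 1)
                for m, s in zip(self.means[i], self.log_stds[i])]


psd = ProbabilisticSlotDictionary(num_entries=4, key_dim=3, slot_dim=8)
slot = psd.sample_slot(psd.keys[2])  # query with an existing key for demonstration
```

In this sketch a query vector selects one dictionary entry, and the slot is drawn from that entry's distribution rather than from a single distribution shared by all slots, which is one way to read the "specialized slot-level binding" the abstract refers to.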

arXiv information

Authors	Avinash Kori, Francesco Locatello, Francesca Toni, Ben Glocker
Published	2023-07-18 17:11:55+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
