A Generalist Framework for Panoptic Segmentation of Images and Videos

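Summary

Panoptic segmentation assigns a semantic label and an instance ID label to every pixel of an image.
Because any permutation of the instance IDs is also a valid solution, the task requires learning a high-dimensional one-to-many mapping.
As a result, state-of-the-art approaches rely on customized architectures and task-specific loss functions.
We formulate panoptic segmentation as a discrete data generation problem, without relying on the inductive bias of the task.
A diffusion model based on analog bits models the panoptic masks, using a simple, generic architecture and loss function.
By simply adding past predictions as a conditioning signal, our method can model video (in a streaming setting) and thereby learns to track object instances automatically.
Extensive experiments show that our generalist approach performs competitively with state-of-the-art specialist methods in similar settings.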

Abstract (original)

Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image. As permutations of instance IDs are also valid solutions, the task requires learning of high-dimensional one-to-many mapping. As a result, state-of-the-art approaches use customized architectures and task-specific loss functions. We formulate panoptic segmentation as a discrete data generation problem, without relying on inductive bias of the task. A diffusion model based on analog bits is used to model panoptic masks, with a simple, generic architecture and loss function. By simply adding past predictions as a conditioning signal, our method is capable of modeling video (in a streaming setting) and thereby learns to track object instances automatically. With extensive experiments, we demonstrate that our generalist approach can perform competitively to state-of-the-art specialist methods in similar settings.
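
The abstract does not spell out how analog bits represent a panoptic mask. As a rough illustration only, the sketch below assumes the common reading of the idea: each discrete pixel label is written in binary, the bits are mapped to real values in {-1, +1} so a continuous diffusion model can operate on them, and a prediction is decoded by thresholding at zero. The function names, the 8-bit width, and the toy mask are hypothetical and are not taken from the paper's code; no diffusion model is included.

```python
# Sketch of an analog-bits encoding for label masks (assumed details, see text).
NUM_BITS = 8  # assumed bit width; covers label IDs below 256


def int_to_analog_bits(value, num_bits=NUM_BITS, scale=1.0):
    """Encode a non-negative integer as reals in {-scale, +scale}, MSB first."""
    if not 0 <= value < 2 ** num_bits:
        raise ValueError("value does not fit in num_bits")
    bits = [(value >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]
    return [(2 * b - 1) * scale for b in bits]


def analog_bits_to_int(analog_bits):
    """Decode by thresholding each real value at zero, then reading the bits."""
    value = 0
    for x in analog_bits:
        value = (value << 1) | (1 if x > 0 else 0)
    return value


def encode_mask(mask):
    """Turn a 2D list of integer labels into a 2D list of analog-bit vectors."""
    return [[int_to_analog_bits(v) for v in row] for row in mask]


def decode_mask(analog_mask):
    """Invert encode_mask; tolerant of perturbations that keep each sign."""
    return [[analog_bits_to_int(pixel) for pixel in row] for row in analog_mask]


# Toy 2x2 mask of instance IDs (made-up values).
mask = [[0, 3], [3, 7]]
analog = encode_mask(mask)

# A small perturbation stands in for an imperfect continuous prediction.
perturbed = [[[x + 0.3 for x in pixel] for pixel in row] for row in analog]
recovered = decode_mask(perturbed)
```

In this toy setting `recovered` equals `mask`, because a shift of 0.3 does not flip the sign of any bit. Thresholding is what lets a model trained on continuous values emit discrete labels.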

arXiv information

Authors	Ting Chen, Lala Li, Saurabh Saxena, Geoffrey Hinton, David J. Fleet
Published	2022-10-12 16:18:25+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
