PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

Summary

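Panoramic videos contain richer spatial information and have attracted considerable attention for the exceptional experience they offer in fields such as autonomous driving and virtual reality.
However, existing video segmentation datasets focus only on conventional planar images.
To address this gap, this paper introduces PanoVOS, a panoramic video dataset.
The dataset provides 150 high-resolution videos with diverse motion.
To quantify the domain gap between 2D planar videos and panoramic videos, we evaluate 15 off-the-shelf video object segmentation (VOS) models on PanoVOS.
Error analysis shows that none of them handles the pixel-level content discontinuities of panoramic videos.
We therefore propose the Panoramic Space Consistency Transformer (PSCFormer), which effectively uses the semantic boundary information of the previous frame for pixel-level matching with the current frame.
Extensive experiments show that our PSCFormer network has a clear advantage over previous SOTA models in segmentation results under the panoramic setting.
Our dataset poses new challenges for panoramic VOS, and we hope that PanoVOS will advance the development of panoramic segmentation and tracking.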
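
The pixel-level content discontinuity mentioned above can be illustrated with a toy example. The sketch below is not PSCFormer and is not taken from the paper. It assumes frames are stored in an equirectangular-style layout in which the left and right image edges are neighbours on the sphere, which the summary itself does not state. The helper names `wrap_pad`, `count_planar_runs`, and `count_circular_runs` are hypothetical.

```python
import numpy as np


def wrap_pad(frame: np.ndarray, pad: int) -> np.ndarray:
    """Circularly pad a (H, W) array along its width.

    Assumption: the left and right image edges are adjacent in the scene,
    so columns from one side are copied next to the other side.
    """
    return np.concatenate([frame[:, -pad:], frame, frame[:, :pad]], axis=1)


def count_planar_runs(row: np.ndarray) -> int:
    """Count runs of consecutive 1s in a binary row, treating it as planar."""
    padded = np.concatenate([[0], row.astype(int), [0]])
    # A run starts wherever the value steps from 0 to 1.
    return int(np.sum(np.diff(padded) == 1))


def count_circular_runs(row: np.ndarray) -> int:
    """Count runs of 1s when the last and first columns are neighbours."""
    runs = count_planar_runs(row)
    # Two separate runs touching both edges are one run across the seam.
    if runs > 1 and row[0] == 1 and row[-1] == 1:
        runs -= 1
    return runs


# One object that straddles the seam of an 8-pixel-wide mask row.
mask_row = np.array([1, 1, 0, 0, 0, 0, 1, 1])

print(count_planar_runs(mask_row))    # 2: a planar model sees two fragments
print(count_circular_runs(mask_row))  # 1: on the sphere it is one object
print(wrap_pad(mask_row[None, :], 2)[0])  # fragments become contiguous
```

Under this assumption, a model that treats each frame as an ordinary planar image sees a single object as two disconnected fragments whenever the object crosses the seam, which is one way such a discontinuity could hurt pixel-level matching.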

Summary (original)

Panoramic videos contain richer spatial information and have attracted tremendous amounts of attention due to their exceptional experience in some fields such as autonomous driving and virtual reality. However, existing datasets for video segmentation only focus on conventional planar images. To address the challenge, in this paper, we present a panoramic video dataset, PanoVOS. The dataset provides 150 videos with high video resolutions and diverse motions. To quantify the domain gap between 2D planar videos and panoramic videos, we evaluate 15 off-the-shelf video object segmentation (VOS) models on PanoVOS. Through error analysis, we found that all of them fail to tackle pixel-level content discontinues of panoramic videos. Thus, we present a Panoramic Space Consistency Transformer (PSCFormer), which can effectively utilize the semantic boundary information of the previous frame for pixel-level matching with the current frame. Extensive experiments demonstrate that compared with the previous SOTA models, our PSCFormer network exhibits a great advantage in terms of segmentation results under the panoramic setting. Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

arXiv information

Authors	Shilin Yan, Xiaohao Xu, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei Zhang
Published	2023-09-21 17:59:02+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
