Out of the Room: Generalizing Event-Based Dynamic Motion Segmentation for Complex Scenes

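Summary

Rapid and reliable identification of dynamic scene parts, also known as motion segmentation, is a key challenge for mobile sensors.
Contemporary RGB camera-based methods rely on modeling camera and scene properties, but they are often under-constrained and fall short on unknown categories.
Event cameras have the potential to overcome these limitations, but corresponding methods have only been demonstrated in small-scale indoor environments with simplified dynamic objects.
This work presents an event-based method for class-agnostic motion segmentation that can also be deployed successfully in complex, large-scale outdoor environments.
To this end, it introduces a novel divide-and-conquer pipeline that combines (a) ego-motion compensated events, computed via a scene understanding module that predicts monocular depth and camera pose as auxiliary tasks, and (b) optical flow from a dedicated optical flow module.
These intermediate representations are fed into a segmentation module that predicts motion segmentation masks.
A novel transformer-based temporal attention module inside the segmentation module builds correlations across adjacent "frames" to obtain temporally consistent segmentation masks.
The method sets a new state of the art on the classic EV-IMO benchmark (indoors), with improvements of 2.19 moving object IoU (2.22 mIoU) and 4.52 point IoU, respectively, and on DSEC-MOTS, a newly generated motion segmentation and tracking benchmark (outdoors) based on the DSEC event dataset, where it shows an improvement of 12.91 moving object IoU.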

Abstract (original)

Rapid and reliable identification of dynamic scene parts, also known as motion segmentation, is a key challenge for mobile sensors. Contemporary RGB camera-based methods rely on modeling camera and scene properties; however, they are often under-constrained and fall short on unknown categories. Event cameras have the potential to overcome these limitations, but corresponding methods have only been demonstrated in smaller-scale indoor environments with simplified dynamic objects. This work presents an event-based method for class-agnostic motion segmentation that can successfully be deployed across complex large-scale outdoor environments too. To this end, we introduce a novel divide-and-conquer pipeline that combines: (a) ego-motion compensated events, computed via a scene understanding module that predicts monocular depth and camera pose as auxiliary tasks, and (b) optical flow from a dedicated optical flow module. These intermediate representations are then fed into a segmentation module that predicts motion segmentation masks. A novel transformer-based temporal attention module in the segmentation module builds correlations across adjacent 'frames' to obtain temporally consistent segmentation masks. Our method sets the new state of the art on the classic EV-IMO benchmark (indoors), where we achieve improvements of 2.19 moving object IoU (2.22 mIoU) and 4.52 point IoU respectively, as well as on a newly generated motion segmentation and tracking benchmark (outdoors) based on the DSEC event dataset, termed DSEC-MOTS, where we show an improvement of 12.91 moving object IoU.
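
The IoU figures quoted in the abstract can be read against the standard intersection-over-union definition. The following is a minimal sketch and not the authors' evaluation code: the helper names `iou` and `flatten`, the toy masks, and the reading of mIoU as the mean of the moving-object and background IoU are all assumptions made for illustration.

```python
# Minimal sketch of per-class IoU on binary motion masks.
# Assumptions (not from the source): label 1 = moving object, label 0 = static
# background, and mIoU = mean of the two per-class IoU values.

def flatten(mask):
    """Turn a 2D list-of-lists mask into a flat list of labels."""
    return [value for row in mask for value in row]


def iou(pred, target, label):
    """Intersection over union of one class label between two flat label lists."""
    pairs = list(zip(pred, target))
    intersection = sum(1 for p, t in pairs if p == label and t == label)
    union = sum(1 for p, t in pairs if p == label or t == label)
    # If the class is absent from both masks, IoU is undefined.
    return intersection / union if union else float("nan")


# Hypothetical 2x3 prediction and ground-truth masks.
predicted_mask = [[0, 1, 1],
                  [0, 0, 0]]
ground_truth_mask = [[0, 1, 1],
                     [0, 1, 0]]

pred_flat = flatten(predicted_mask)
gt_flat = flatten(ground_truth_mask)

moving_iou = iou(pred_flat, gt_flat, label=1)
static_iou = iou(pred_flat, gt_flat, label=0)
mean_iou = (moving_iou + static_iou) / 2

print(f"moving object IoU: {moving_iou:.4f}")
print(f"background IoU:    {static_iou:.4f}")
print(f"mIoU:              {mean_iou:.4f}")
```

Benchmark numbers such as those above are usually reported as percentages averaged over a dataset, so a per-image value like the one computed here would be aggregated over all evaluation frames; the exact aggregation used for EV-IMO and DSEC-MOTS is not stated in the abstract.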

arXiv information

Authors	Stamatios Georgoulis, Weining Ren, Alfredo Bochicchio, Daniel Eckert, Yuanyou Li, Abel Gawel
Published	2024-03-07 14:59:34+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
