3D-Aware Instance Segmentation and Tracking in Egocentric Videos

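Abstract

Egocentric videos pose distinctive challenges for 3D scene understanding because of rapid camera motion, frequent object occlusions, and limited object visibility.
This paper introduces a new approach to instance segmentation and tracking in first-person video that uses 3D awareness to overcome these obstacles.
Our method integrates scene geometry, 3D object centroid tracking, and instance segmentation into a robust framework for analyzing dynamic egocentric scenes.
By incorporating spatial and temporal cues, we achieve better performance than state-of-the-art 2D approaches.
Extensive evaluation on the challenging EPIC Fields dataset shows significant improvements across a range of tracking and segmentation consistency metrics.
Specifically, our method outperforms the next-best approach by $7$ points in Association Accuracy (AssA) and $4.5$ points in IDF1 score, and reduces the number of ID switches by $73\%$ to $80\%$ across object categories.
Building on our tracked instance segmentations, we demonstrate downstream applications in 3D object reconstruction and amodal video object segmentation in these egocentric settings.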

Abstract (original)

Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person video that leverages 3D awareness to overcome these obstacles. Our method integrates scene geometry, 3D object centroid tracking, and instance segmentation to create a robust framework for analyzing dynamic egocentric scenes. By incorporating spatial and temporal cues, we achieve superior performance compared to state-of-the-art 2D approaches. Extensive evaluations on the challenging EPIC Fields dataset demonstrate significant improvements across a range of tracking and segmentation consistency metrics. Specifically, our method outperforms the next best performing approach by $7$ points in Association Accuracy (AssA) and $4.5$ points in IDF1 score, while reducing the number of ID switches by $73\%$ to $80\%$ across various object categories. Leveraging our tracked instance segmentations, we showcase downstream applications in 3D object reconstruction and amodal video object segmentation in these egocentric settings.

arXiv information

Authors	Yash Bhalgat, Vadim Tschernezki, Iro Laina, João F. Henriques, Andrea Vedaldi, Andrew Zisserman
Published	2024-11-20 12:51:25+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
