M-FUSE: Multi-frame Fusion for Scene Flow Estimation

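Summary

Recently, neural networks for scene flow estimation have shown impressive results on automotive data such as the KITTI benchmark.
However, despite using sophisticated rigidity assumptions and parametrizations, such networks are typically limited to only two frame pairs, which prevents them from exploiting temporal information.
In our paper, we address this shortcoming by proposing a novel multi-frame approach that considers an additional preceding stereo pair.
To this end, we proceed in two steps. First, building upon the recent RAFT-3D approach, we develop an advanced two-frame baseline by incorporating an improved stereo method.
Second, and more importantly, exploiting the specific modeling concepts of RAFT-3D, we propose a U-Net-like architecture that fuses forward and backward flow estimates and thereby allows temporal information to be integrated on demand.
Experiments on the KITTI benchmark show not only that the advantages of the improved baseline and the temporal fusion approach complement each other, but also that the computed scene flow is highly accurate.
More precisely, our approach ranks second overall and first on the even more challenging foreground objects, outperforming the original RAFT-3D method by more than 16% in total.
Code is available at https://github.com/cv-stuttgart/M-FUSE.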

Summary (original)

Recently, neural networks for scene flow estimation show impressive results on automotive data such as the KITTI benchmark. However, despite using sophisticated rigidity assumptions and parametrizations, such networks are typically limited to only two frame pairs, which does not allow them to exploit temporal information. In our paper, we address this shortcoming by proposing a novel multi-frame approach that considers an additional preceding stereo pair. To this end, we proceed in two steps: Firstly, building upon the recent RAFT-3D approach, we develop an advanced two-frame baseline by incorporating an improved stereo method. Secondly, and even more importantly, exploiting the specific modeling concepts of RAFT-3D, we propose a U-Net-like architecture that performs a fusion of forward and backward flow estimates and hence allows temporal information to be integrated on demand. Experiments on the KITTI benchmark not only show that the advantages of the improved baseline and the temporal fusion approach complement each other, they also demonstrate that the computed scene flow is highly accurate. More precisely, our approach ranks second overall and first for the even more challenging foreground objects, in total outperforming the original RAFT-3D method by more than 16%. Code is available at https://github.com/cv-stuttgart/M-FUSE.

arXiv information

Authors	Lukas Mehl, Azin Jahedi, Jenny Schmalfuss, Andrés Bruhn
Published	2022-07-12 17:26:55+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
