RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation

Summary

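Methods that fuse RGB images and point clouds have recently been proposed to jointly estimate 2D optical flow and 3D scene flow.
However, conventional RGB cameras and LiDAR sensors both acquire data frame by frame, so their fixed, low sampling rates limit performance, especially in highly dynamic scenes.
By contrast, an event camera asynchronously captures intensity changes at very high temporal resolution, providing complementary dynamic information about the observed scene.
In this paper, we incorporate RGB images, point clouds, and events to jointly estimate optical flow and scene flow with our proposed multi-stage multimodal fusion model, RPEFlow.
First, we present an attention fusion module with a cross-attention mechanism that implicitly explores the internal cross-modal correlation in the 2D and 3D branches, respectively.
Second, we introduce a mutual information regularization term that explicitly models the complementary information of the three modalities for effective multimodal feature learning.
We also contribute a new synthetic dataset to encourage further research.
Experiments on both synthetic and real datasets show that our model outperforms existing state-of-the-art methods by a wide margin.
The code and dataset are available at https://npucvr.github.io/RPEFlow.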

Summary (original)

Recently, the RGB images and point clouds fusion methods have been proposed to jointly estimate 2D optical flow and 3D scene flow. However, as both conventional RGB cameras and LiDAR sensors adopt a frame-based data acquisition mechanism, their performance is limited by the fixed low sampling rates, especially in highly-dynamic scenes. By contrast, the event camera can asynchronously capture the intensity changes with a very high temporal resolution, providing complementary dynamic information of the observed scenes. In this paper, we incorporate RGB images, Point clouds and Events for joint optical flow and scene flow estimation with our proposed multi-stage multimodal fusion model, RPEFlow. First, we present an attention fusion module with a cross-attention mechanism to implicitly explore the internal cross-modal correlation for 2D and 3D branches, respectively. Second, we introduce a mutual information regularization term to explicitly model the complementary information of three modalities for effective multimodal feature learning. We also contribute a new synthetic dataset to advocate further research. Experiments on both synthetic and real datasets show that our model outperforms the existing state-of-the-art by a wide margin. Code and dataset is available at https://npucvr.github.io/RPEFlow.
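As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below lets features from one modality (queries) attend to features from another (keys and values) using plain scaled dot-product attention. It is not the RPEFlow attention fusion module: the learned projections, multi-head structure, and the residual fusion step are all omitted or assumed, and the names `cross_attention`, `image_feats`, and `event_feats` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: N_q x d features from one modality (e.g. image tokens).
    keys, values: N_k x d and N_k x d_v features from another modality.
    Returns N_q x d_v attended features.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    out_dim = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key in the other modality.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the other modality's value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
        )
    return attended


# Toy features (assumed values, for illustration only).
image_feats = [[1.0, 0.0], [0.0, 1.0]]
event_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended = cross_attention(image_feats, event_feats, event_feats)

# Assumed fusion step: add the attended event features back to the image features.
fused = [[a + b for a, b in zip(img, att)] for img, att in zip(image_feats, attended)]
```

Each output row is a convex combination of the event features, so an image token draws most heavily on the event tokens it is most similar to.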

arXiv information

Authors	Zhexiong Wan, Yuxin Mao, Jing Zhang, Yuchao Dai
Published	2023-09-26 17:23:55+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
