MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Abstract

Accurate and reliable 3D detection is vital for many applications, including autonomous driving vehicles and service robots. In this paper, we present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences. We propose a novel three-hierarchy framework with proxy points for multi-frame feature encoding and interactions to achieve better detection. The three hierarchies conduct per-frame feature encoding, short-clip feature fusion, and whole-sequence feature aggregation, respectively. To enable processing long-sequence point clouds with reasonable computational resources, intra-group feature mixing and inter-group feature attention are proposed to form the second and third feature encoding hierarchies, which are recurrently applied for aggregating multi-frame trajectory features. The proxy points not only act as consistent object representations for each frame, but also serve as couriers that facilitate feature interaction between frames. Experiments on the large-scale Waymo Open dataset show that our approach outperforms state-of-the-art methods by large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences. Code is available at https://github.com/open-mmlab/OpenPCDet.
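The abstract describes the second and third hierarchies only at a high level, so the following is a toy sketch of the data flow rather than MPPNet's actual layers. It assumes that per-frame proxy-point features (the first hierarchy) are already computed, uses plain averaging as a stand-in for intra-group feature mixing, and uses unlearned scaled dot-product attention as a stand-in for inter-group feature attention. It also applies each stage once, whereas the abstract says they are applied recurrently. All function names and the group size are hypothetical and are not taken from the paper or from OpenPCDet.

```python
import math


def split_into_groups(frames, group_size):
    """Split per-frame features into consecutive short clips (groups)."""
    if len(frames) % group_size != 0:
        raise ValueError("number of frames must be divisible by group_size")
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]


def intra_group_mix(group):
    """Fuse one short clip: average each proxy point's feature over its frames.

    Assumption: averaging stands in for a learned mixing layer.
    """
    n_frames, n_points, dim = len(group), len(group[0]), len(group[0][0])
    return [
        [sum(frame[p][d] for frame in group) / n_frames for d in range(dim)]
        for p in range(n_points)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def inter_group_attention(group_feats):
    """Let every group attend to all groups, separately for each proxy point.

    Assumption: queries, keys, and values are the raw features (no learned
    projections), so this only illustrates the aggregation pattern.
    """
    n_groups, n_points, dim = len(group_feats), len(group_feats[0]), len(group_feats[0][0])
    scale = math.sqrt(dim)
    output = []
    for g in range(n_groups):
        points = []
        for p in range(n_points):
            query = group_feats[g][p]
            keys = [group_feats[k][p] for k in range(n_groups)]
            scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]
            weights = softmax(scores)
            points.append(
                [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
            )
        output.append(points)
    return output


def aggregate_sequence(frames, group_size):
    """Run clip fusion followed by whole-sequence aggregation."""
    groups = split_into_groups(frames, group_size)
    mixed = [intra_group_mix(group) for group in groups]
    return inter_group_attention(mixed)
```

With 16 frames and a hypothetical group size of 4, `aggregate_sequence(frames, 4)` returns four group-level feature sets, each holding one feature vector per proxy point. Attention then runs over 4 groups instead of 16 frames, which illustrates how grouping can keep long sequences affordable.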

arXiv information

Authors	Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li
Published	2022-09-02 15:08:08+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
