LiDAR-Camera Fusion for Video Panoptic Segmentation without Video Training

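Summary

Panoptic segmentation, which combines instance segmentation and semantic segmentation, has attracted considerable attention in autonomous driving because it represents a scene comprehensively.
The task can be applied to both camera and LiDAR data, but little work has focused on combining the two sensors to improve image panoptic segmentation (PS).
Earlier studies acknowledged that 3D data benefits camera-based scene perception, yet none specifically examined its effect on image panoptic segmentation and video panoptic segmentation (VPS).
This work aims to introduce a feature fusion module that improves PS and VPS by fusing LiDAR and image data for autonomous vehicles.
It also shows that, in addition to this fusion, the proposed model, which uses two simple modifications, can deliver higher-quality VPS without being trained on video data.
The results show an improvement of up to 5 points in both the image and video panoptic segmentation evaluation metrics.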

Summary (original)

Panoptic segmentation, which combines instance and semantic segmentation, has gained a lot of attention in autonomous vehicles, due to its comprehensive representation of the scene. This task can be applied for cameras and LiDAR sensors, but there has been a limited focus on combining both sensors to enhance image panoptic segmentation (PS). Although previous research has acknowledged the benefit of 3D data on camera-based scene perception, no specific study has explored the influence of 3D data on image and video panoptic segmentation (VPS). This work seeks to introduce a feature fusion module that enhances PS and VPS by fusing LiDAR and image data for autonomous vehicles. We also illustrate that, in addition to this fusion, our proposed model, which utilizes two simple modifications, can further deliver even more high-quality VPS without being trained on video data. The results demonstrate a substantial improvement in both the image and video panoptic segmentation evaluation metrics by up to 5 points.

arXiv information

Authors	Fardin Ayar, Ehsan Javanmardi, Manabu Tsukada, Mahdi Javanmardi, Mohammad Rahmati
Published	2024-12-30 11:43:51+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
