IROAM: Improving Roadside Monocular 3D Object Detection Learning from Autonomous Vehicle Data Domain

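Summary

In autonomous driving, roadside sensors can improve the ego-vehicle's perception because they provide a holistic view of the environment.
However, existing monocular detection methods designed for vehicle cameras are not well suited to roadside cameras because of the viewpoint domain gap.
To bridge this gap and improve roadside monocular 3D object detection, we propose IROAM, a semantic-geometry decoupled contrastive learning framework that takes vehicle-side and roadside data as input simultaneously.
IROAM has two key modules.
The In-Domain Query Interaction module uses a transformer to learn content and depth information for each domain and outputs object queries.
To learn better feature representations from the two domains, the Cross-Domain Query Enhancement module decouples queries into semantic and geometry parts, and only the semantic part is used for contrastive learning.
Experiments demonstrate the effectiveness of IROAM in improving the performance of roadside detectors.
The results confirm that IROAM can learn cross-domain information.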

Summary (original)

In autonomous driving, the perception capabilities of the ego-vehicle can be improved with roadside sensors, which can provide a holistic view of the environment. However, existing monocular detection methods designed for vehicle cameras are not suitable for roadside cameras due to viewpoint domain gaps. To bridge this gap and Improve ROAdside Monocular 3D object detection, we propose IROAM, a semantic-geometry decoupled contrastive learning framework, which takes vehicle-side and roadside data as input simultaneously. IROAM has two significant modules. The In-Domain Query Interaction module utilizes a transformer to learn content and depth information for each domain and outputs object queries. To learn better feature representations from the two domains, the Cross-Domain Query Enhancement module decouples queries into semantic and geometry parts, and only the former is used for contrastive learning. Experiments demonstrate the effectiveness of IROAM in improving the roadside detector's performance. The results validate that IROAM is capable of learning cross-domain information.
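
The abstract does not say how queries are decoupled or which contrastive loss is used, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each object query is a plain vector whose first `semantic_dim` entries form the semantic part and whose remaining entries form the geometry part, that a roadside query and a vehicle-side query are paired as positives through a hypothetical `positive_index` list, and that an InfoNCE-style loss is applied. All function and parameter names are illustrative.

```python
import math


def split_query(query, semantic_dim):
    """Split one query vector into (semantic, geometry) parts by slicing.

    Slicing at a fixed index is an assumption made for illustration only.
    """
    return query[:semantic_dim], query[semantic_dim:]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, candidates, positive_index, temperature=0.1):
    """Mean InfoNCE loss; anchor i's positive is candidates[positive_index[i]]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[positive_index[i]]
    return total / len(anchors)


def semantic_contrastive_loss(vehicle_queries, roadside_queries,
                              positive_index, semantic_dim, temperature=0.1):
    """Contrast only the semantic parts; geometry parts never enter the loss."""
    vehicle_sem = [split_query(q, semantic_dim)[0] for q in vehicle_queries]
    roadside_sem = [split_query(q, semantic_dim)[0] for q in roadside_queries]
    return info_nce(roadside_sem, vehicle_sem, positive_index, temperature)


# Toy queries: 2 semantic dimensions followed by 2 geometry dimensions.
vehicle = [[1.0, 0.0, 5.0, 5.0], [0.0, 1.0, -3.0, 2.0]]
roadside = [[1.0, 0.0, 9.0, 9.0], [0.0, 1.0, 0.0, 0.0]]
loss = semantic_contrastive_loss(vehicle, roadside, [0, 1], semantic_dim=2)
```

Because only the sliced semantic parts reach `info_nce`, changing the geometry entries of any query leaves the loss unchanged in this sketch, which mirrors the statement that only the semantic part is used for contrastive learning.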

arXiv information

Authors	Zhe Wang, Xiaoliang Huo, Siqi Fan, Jingjing Liu, Ya-Qin Zhang, Yan Wang
Published	2025-01-30 06:10:23+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
