Q-PETR: Quant-aware Position Embedding Transformation for Multi-View 3D Object Detection

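Summary

Camera-based multi-view 3D detection has emerged as an attractive solution for autonomous driving thanks to its low cost and broad applicability.
However, although PETR-based methods perform strongly on 3D perception benchmarks, direct INT8 quantization for onboard deployment causes drastic accuracy drops of up to 58.2% in mAP and 36.9% in NDS on the NuScenes dataset.
This work proposes Q-PETR, a quantization-aware position embedding transformation that re-engineers key components of the PETR framework to reconcile the mismatch between the dynamic ranges of the positional encodings and the image features, and to adapt the cross-attention mechanism for low-bit inference.
By redesigning the positional encoding module and introducing an adaptive quantization strategy, Q-PETR stays within 1% of floating-point performance under standard 8-bit per-tensor post-training quantization.
In addition, compared with its FP32 counterpart, Q-PETR achieves a two-fold speedup and reduces memory usage by a factor of three, offering a deployment-friendly solution for resource-constrained onboard devices.
Extensive experiments across various PETR-series models validate the strong generalization and practical benefits of the approach.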
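
The dynamic-range mismatch mentioned above can be illustrated with a small, self-contained sketch. This is not the Q-PETR method or code from the paper; it only shows, under assumed toy data, why symmetric per-tensor INT8 quantization loses precision when narrow-range values have to share a single scale with wide-range values. The names `quantize_per_tensor`, `features`, and `pos_enc`, and the value ranges chosen for them, are hypothetical.

```python
import random

def quantize_per_tensor(values, num_bits=8):
    """Symmetric per-tensor quantization: one scale for the whole tensor."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map integer levels back to floating point."""
    return [x * scale for x in q]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
# Assumed toy data: narrow-range "image features" and
# wide-range "positional encodings" (ranges are illustrative only).
features = [random.uniform(-1.0, 1.0) for _ in range(1000)]
pos_enc = [random.uniform(-100.0, 100.0) for _ in range(1000)]

# Case 1: the features get their own scale.
q_alone, scale_alone = quantize_per_tensor(features)
err_alone = mean_abs_error(features, dequantize(q_alone, scale_alone))

# Case 2: the features share one scale with the wide-range values.
_, scale_shared = quantize_per_tensor(features + pos_enc)
q_shared = [round(v / scale_shared) for v in features]
err_shared = mean_abs_error(features, dequantize(q_shared, scale_shared))

print(f"scale (features only): {scale_alone:.5f}, error: {err_alone:.5f}")
print(f"scale (shared):        {scale_shared:.5f}, error: {err_shared:.5f}")
```

With the shared scale, the narrow-range values collapse onto only a few integer levels, so their reconstruction error grows sharply. This is the general kind of effect that motivates reconciling dynamic ranges before applying per-tensor quantization.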

Summary (original)

Camera-based multi-view 3D detection has emerged as an attractive solution for autonomous driving due to its low cost and broad applicability. However, despite the strong performance of PETR-based methods in 3D perception benchmarks, their direct INT8 quantization for onboard deployment leads to drastic accuracy drops of up to 58.2% in mAP and 36.9% in NDS on the NuScenes dataset. In this work, we propose Q-PETR, a quantization-aware position embedding transformation that re-engineers key components of the PETR framework to reconcile the discrepancy between the dynamic ranges of positional encodings and image features, and to adapt the cross-attention mechanism for low-bit inference. By redesigning the positional encoding module and introducing an adaptive quantization strategy, Q-PETR maintains floating-point performance with a performance degradation of less than 1% under standard 8-bit per-tensor post-training quantization. Moreover, compared to its FP32 counterpart, Q-PETR achieves a two-fold speedup and reduces memory usage by a factor of three, thereby offering a deployment-friendly solution for resource-constrained onboard devices. Extensive experiments across various PETR-series models validate the strong generalization and practical benefits of our approach.

arXiv information

Authors	Jiangyong Yu, Changyong Shu, Dawei Yang, Sifan Zhou, Zichen Yu, Xing Hu, Yan Chen
Published	2025-03-11 15:05:41+00:00
arXiv page	arxiv_id (pdf)

Sources and services used

arxiv.jp, Google
