NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar

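Abstract

Visual grounding and multi-sensor setups have recently been incorporated into the perception systems of terrestrial autonomous driving systems and Unmanned Surface Vehicles (USVs). However, modern learning-based visual grounding models that use multiple sensors are too complex to deploy on real-world USVs.
To address this, we design NanoMVG, a low-power multi-task model for waterway embodied perception that guides both a camera and a 4D millimeter-wave radar to locate specific object(s) through natural language.
NanoMVG can perform box-level and mask-level visual grounding tasks simultaneously.
Compared with other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments, and has ultra-low power consumption for long endurance.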

Abstract (original)

Recently, visual grounding and multi-sensors setting have been incorporated into perception system for terrestrial autonomous driving systems and Unmanned Surface Vehicles (USVs), yet the high complexity of modern learning-based visual grounding model using multi-sensors prevents such model to be deployed on USVs in the real-life. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform both box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments and boasts ultra-low power consumption for long endurance.

arXiv information

Authors	Runwei Guan, Jianan Liu, Liye Jia, Haocheng Zhao, Shanliang Yao, Xiaohui Zhu, Ka Lok Man, Eng Gee Lim, Jeremy Smith, Yutao Yue
Published	2024-08-30 11:22:09+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
