Online Robot Navigation and Manipulation with Distilled Vision-Language Models

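Summary

Autonomous robot navigation in dynamic, unknown environments is crucial for mobile robotics applications such as last-mile delivery and robot-enabled automated supply in industrial and hospital settings.
Current solutions still have limitations: robots cannot recognize unknown objects in real time, and they cannot move freely through dynamic, narrow, and complex environments.
We propose a complete software framework for autonomous robot perception and navigation among very dense obstacles and dense human crowds.
First, we propose a framework that accurately detects and segments open-world object categories in a zero-shot manner, overcoming the over-segmentation limitation of the current SAM model.
Second, we propose a distillation strategy that distills the knowledge needed to segment the free space of walkways for robot navigation without labels.
In addition, we design a trimming strategy that works together with distillation to enable lightweight inference, so that the neural network can be deployed on edge devices such as the NVIDIA-TX2 or Xavier NX during autonomous navigation.
Extensive experiments with the framework integrated into a robot navigation system demonstrate that it achieves superior performance in both accuracy and efficiency for robot scene perception and autonomous robot navigation.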
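
The abstract does not state which loss the distillation strategy uses. As a rough illustration only, the sketch below shows a generic temperature-softened distillation loss (the KL divergence from a teacher's per-pixel class distribution to a student's), which is one common way to train a small segmentation network from a larger model's outputs without ground-truth labels. The function names, the two-class "free space / not free space" setup, the temperature value, and the toy logits are all assumptions made for this example; they are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one pixel's class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean per-pixel KL(teacher || student) on softened distributions.

    Both arguments are lists of per-pixel logit lists. The result is scaled
    by temperature**2, the usual convention in generic knowledge distillation.
    This is a hypothetical stand-in, not the loss used by the authors.
    """
    total = 0.0
    for s_pix, t_pix in zip(student_logits, teacher_logits):
        p_s = softmax(s_pix, temperature)
        p_t = softmax(t_pix, temperature)
        for pt, ps in zip(p_t, p_s):
            if pt > 0.0:  # terms with zero teacher probability contribute 0
                total += pt * (math.log(pt) - math.log(max(ps, 1e-12)))
    return temperature ** 2 * total / len(teacher_logits)


# Toy data (assumed): three pixels, two classes (free space, not free space).
teacher = [[2.0, -1.0], [0.5, 0.5], [-1.5, 3.0]]
student = [[1.0, 0.0], [0.0, 1.0], [-0.5, 1.0]]

loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```

In this setup the loss is zero when the student reproduces the teacher's logits and positive otherwise, so minimizing it pulls the student's free-space predictions toward the teacher's without any labeled masks.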

Summary (original)

Autonomous robot navigation within the dynamic unknown environment is of crucial significance for mobile robotic applications including robot navigation in last-mile delivery and robot-enabled automated supplies in industrial and hospital delivery applications. Current solutions still suffer from limitations, such as the robot cannot recognize unknown objects in real time and cannot navigate freely in a dynamic, narrow, and complex environment. We propose a complete software framework for autonomous robot perception and navigation within very dense obstacles and dense human crowds. First, we propose a framework that accurately detects and segments open-world object categories in a zero-shot manner, which overcomes the over-segmentation limitation of the current SAM model. Second, we proposed the distillation strategy to distill the knowledge to segment the free space of the walkway for robot navigation without the label. In the meantime, we design the trimming strategy that works collaboratively with distillation to enable lightweight inference to deploy the neural network on edge devices such as NVIDIA-TX2 or Xavier NX during autonomous navigation. Integrated into the robot navigation system, extensive experiments demonstrate that our proposed framework has achieved superior performance in terms of both accuracy and efficiency in robot scene perception and autonomous robot navigation.

arXiv information

Authors	Kangcheng Liu, Xinhu Zheng, Chaoqun Wang, Hesheng Wang, Ming Liu, Kai Tang
Published	2024-01-30 15:05:22+00:00

Source and services used

arxiv.jp, Google
