ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals

Abstract

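We present ForceSight, a system for text-guided mobile manipulation that uses a deep neural network to predict visual-force goals.
Given a single RGBD image combined with a text prompt, ForceSight determines a target end-effector pose in the camera frame (the kinematic goal) and the associated forces (the force goal).
Together, these two components form a visual-force goal.
Prior work has demonstrated that deep models that output human-interpretable kinematic goals can enable dexterous manipulation by real robots.
Forces are critical to manipulation, yet these systems have typically relegated them to lower-level execution.
When deployed on a mobile manipulator equipped with an eye-in-hand RGBD camera, ForceSight performed tasks such as precision grasps, drawer opening, and object handovers with an 81% success rate in unseen environments containing object instances that differed significantly from the training data.
In a separate experiment, relying exclusively on visual servoing and ignoring force goals dropped the success rate from 90% to 45%, demonstrating that force goals can significantly enhance performance.
The appendix, videos, code, and trained models are available at https://force-sight.github.io/.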

Abstract (original)

We present ForceSight, a system for text-guided mobile manipulation that predicts visual-force goals using a deep neural network. Given a single RGBD image combined with a text prompt, ForceSight determines a target end-effector pose in the camera frame (kinematic goal) and the associated forces (force goal). Together, these two components form a visual-force goal. Prior work has demonstrated that deep models outputting human-interpretable kinematic goals can enable dexterous manipulation by real robots. Forces are critical to manipulation, yet have typically been relegated to lower-level execution in these systems. When deployed on a mobile manipulator equipped with an eye-in-hand RGBD camera, ForceSight performed tasks such as precision grasps, drawer opening, and object handovers with an 81% success rate in unseen environments with object instances that differed significantly from the training data. In a separate experiment, relying exclusively on visual servoing and ignoring force goals dropped the success rate from 90% to 45%, demonstrating that force goals can significantly enhance performance. The appendix, videos, code, and trained models are available at https://force-sight.github.io/.

arXiv information

Authors	Jeremy A. Collins, Cody Houff, You Liang Tan, Charles C. Kemp
Published	2023-09-21 17:59:50+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
