Grasp, See, and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior


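We focus on the task of unknown object rearrangement, in which a robot must rearrange objects into a desired goal configuration specified by an RGB-D image.
We leverage the foundation model CLIP for object matching, policy learning, and self-termination.
A series of experiments shows that GSP can perform unknown object rearrangement with higher completion rates and fewer steps.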


We focus on the task of unknown object rearrangement, where a robot is supposed to re-configure the objects into a desired goal configuration specified by an RGB-D image. Recent works explore unknown object rearrangement systems by incorporating learning-based perception modules. However, they are sensitive to perception error, and pay less attention to task-level performance. In this paper, we aim to develop an effective system for unknown object rearrangement amidst perception noise. We theoretically reveal that the noisy perception impacts grasp and place in a decoupled way, and show such a decoupled structure is valuable to improve task optimality. We propose GSP, a dual-loop system with the decoupled structure as prior. For the inner loop, we learn a see policy for self-confident in-hand object matching. For the outer loop, we learn a grasp policy aware of object matching and grasp capability guided by task-level rewards. We leverage the foundation model CLIP for object matching, policy learning and self-termination. A series of experiments indicate that GSP can conduct unknown object rearrangement with higher completion rates and fewer steps.
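
The dual-loop structure described in the abstract can be illustrated with a toy control-flow sketch. This is not the authors' implementation: the function names (`match`, `see`, `rearrange`), the confidence threshold, the use of plain feature vectors in place of CLIP embeddings, and the first-in-list grasp order in place of a learned grasp policy are all assumptions made for illustration. Termination here is simply "no objects remain", which stands in for the CLIP-based self-termination mentioned above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def match(view, goal_features):
    """Return (best goal index, confidence) for one in-hand view."""
    scores = [cosine(view, g) for g in goal_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def see(views, goal_features, threshold=0.9):
    """Inner loop (hypothetical): look at successive in-hand views until
    the best match is confident enough or the views run out."""
    best, conf = None, -1.0
    for view in views:
        idx, c = match(view, goal_features)
        if c > conf:
            best, conf = idx, c
        if conf >= threshold:
            break
    return best, conf


def rearrange(scene, goal_features, threshold=0.9, max_steps=10):
    """Outer loop (hypothetical): grasp, see, place until nothing is left.

    `scene` is a list of (name, views) pairs, where `views` are the
    feature vectors observed while the object is held.
    """
    placed = {}
    steps = 0
    remaining = list(scene)
    while remaining and steps < max_steps:
        # Stand-in for a learned grasp policy: take the next object.
        name, views = remaining.pop(0)
        goal_idx, _conf = see(views, goal_features, threshold)
        placed[name] = goal_idx  # "place" at the matched goal slot
        steps += 1
    return placed, steps
```

In this sketch an ambiguous first view of an object does not commit the placement; the inner loop keeps looking until a later view matches a goal slot with confidence at or above the threshold, which is the intuition behind matching the object in hand rather than in clutter.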


Authors: Kechun Xu, Zhongxiang Zhou, Jun Wu, Haojian Lu, Rong Xiong, Yue Wang
Published: 2025-01-06 03:42:49+00:00


Categories: cs.LG, cs.RO