RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation

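Summary

RiEMann is an imitation learning framework for SE(3)-equivariant robot manipulation. Compared with previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of the objects to be manipulated, without any object segmentation. With 5 to 10 demonstrations, RiEMann learns a manipulation task from scratch, generalizes to unseen SE(3) transformations and unseen instances of the target object, resists visual interference from distractor objects, and follows pose changes of the target object in near real time. RiEMann's scalable action space makes it easy to add custom equivariant actions, such as the direction in which to turn a faucet, which enables RiEMann to manipulate articulated objects. In simulation and real-world 6-DOF robot manipulation experiments, RiEMann is tested on 5 categories of manipulation tasks with 25 variants in total; it outperforms the baselines in both task success rate and SE(3) geodesic distance error of the predicted poses (a 68.6% reduction), and reaches a network inference speed of 5.4 frames per second (FPS). Code and video results are available at https://riemann-web.github.io/.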

Summary (original)

We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object. The scalable action space of RiEMann facilitates the addition of custom equivariant actions such as the direction of turning the faucet, which makes articulated object manipulation possible for RiEMann. In simulation and real-world 6-DOF robot manipulation experiments, we test RiEMann on 5 categories of manipulation tasks with a total of 25 variants and show that RiEMann outperforms baselines in both task success rates and SE(3) geodesic distance errors on predicted poses (reduced by 68.6%), and achieves a 5.4 frames per second (FPS) network inference speed. Code and video results are available at https://riemann-web.github.io/.
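The abstract reports errors as an SE(3) geodesic distance between predicted and target poses but does not define the metric here. The sketch below shows one common choice, assumed for illustration rather than taken from the paper: the rotation angle of the relative rotation combined with the Euclidean translation offset. The names `rotation_angle`, `se3_distance`, and `make_pose` are hypothetical helpers, and poses are assumed to be 4x4 homogeneous matrices.

```python
import numpy as np


def rotation_angle(R_a, R_b):
    """Geodesic angle in radians between two 3x3 rotation matrices."""
    R_rel = R_a.T @ R_b
    # trace(R) = 1 + 2*cos(theta); clipping guards against round-off.
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_theta))


def se3_distance(T_a, T_b):
    """Assumed metric: sqrt(angle^2 + ||t_a - t_b||^2).

    Note that this mixes radians and length units, so its scale
    depends on the unit chosen for translations.
    """
    angle = rotation_angle(T_a[:3, :3], T_b[:3, :3])
    shift = np.linalg.norm(T_a[:3, 3] - T_b[:3, 3])
    return float(np.hypot(angle, shift))


def make_pose(yaw, translation):
    """Hypothetical helper: pose from a rotation about z and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T


predicted = make_pose(0.0, [0.0, 0.0, 0.0])
target = make_pose(np.pi / 2, [0.3, 0.4, 0.0])
error = se3_distance(predicted, target)

# Applying the same rigid transform to both poses leaves the error unchanged.
G = make_pose(0.7, [1.0, -2.0, 0.5])
error_moved = se3_distance(G @ predicted, G @ target)
```

Under this assumed definition the error does not change when the whole scene is moved by a rigid transform, which is the property that makes it a natural way to score an SE(3)-equivariant predictor.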

arXiv information

Authors	Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu
Published	2024-10-03 11:13:29+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
