The Art of Imitation: Learning Long-Horizon Manipulation Tasks from Few Demonstrations

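Summary

Task-parameterized Gaussian mixture models (TP-GMMs) are a sample-efficient method for learning object-centric robot manipulation tasks.
However, several open challenges remain in applying TP-GMMs in practice.
This work tackles three key challenges synergistically.
First, end-effector velocities are non-Euclidean and therefore hard to model with standard GMMs.
We therefore propose to factorize the robot's end-effector velocity into its direction and magnitude and to model both using Riemannian GMMs.
Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories.
Through this segmentation, we further align the skill trajectories and leverage time as a powerful inductive bias.
Third, we present a method for automatically detecting the relevant task parameters for each skill from visual observations.
Our approach learns complex manipulation tasks from just five demonstrations while using only RGB-D observations.
Extensive experimental evaluation on RLBench demonstrates that our approach achieves state-of-the-art performance with a 20-fold improvement in sample efficiency.
Our policies generalize across different environments, object instances, and object positions, and the learned skills are reusable.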

Abstract (original)

Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize the robot’s end-effector velocity into its direction and magnitude, and model them using Riemannian GMMs. Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories. Through the segmentation, we further align skill trajectories and hence leverage time as a powerful inductive bias. Third, we present a method to automatically detect relevant task parameters per skill from visual observations. Our approach enables learning complex manipulation tasks from just five demonstrations while using only RGB-D observations. Extensive experimental evaluations on RLBench demonstrate that our approach achieves state-of-the-art performance with 20-fold improved sample efficiency. Our policies generalize across different environments, object instances, and object positions, while the learned skills are reusable.
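
The velocity factorization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `factorize_velocity`, `geodesic_distance`, and `recompose`, the `eps` threshold, and the choice to return `None` for an undefined direction are all assumptions made for illustration. The sketch only shows why a unit direction is naturally compared along the sphere (great-circle distance) rather than with a straight-line Euclidean distance.

```python
import math


def factorize_velocity(v, eps=1e-9):
    """Split a 3D velocity into (magnitude, unit direction).

    Hypothetical helper: the direction is returned as None when the
    speed is below eps, because a zero vector has no defined direction.
    """
    magnitude = math.sqrt(sum(c * c for c in v))
    if magnitude < eps:
        return magnitude, None
    return magnitude, tuple(c / magnitude for c in v)


def geodesic_distance(d1, d2):
    """Great-circle distance between two unit vectors on the sphere S^2."""
    dot = sum(a * b for a, b in zip(d1, d2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, dot)))


def recompose(magnitude, direction):
    """Rebuild the velocity vector from its two factors."""
    return tuple(magnitude * c for c in direction)


if __name__ == "__main__":
    speed, direction = factorize_velocity((3.0, 0.0, 4.0))
    print(speed, direction)
    # Opposite directions are pi apart along the sphere.
    print(geodesic_distance((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)))
```

In this sketch the magnitude is an ordinary non-negative scalar, while the direction lives on the unit sphere, which is the kind of non-Euclidean quantity the abstract says a Riemannian GMM is used to model.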

arXiv information

Authors	Jan Ole von Hartz, Tim Welschehold, Abhinav Valada, Joschka Boedecker
Published	2024-10-21 09:12:21+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
