SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation

Summary

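Robot learning has proven to be a general and effective technique for programming manipulators.
Imitation learning can teach robots from human demonstrations alone, but it is bottlenecked by the capabilities of those demonstrations.
Reinforcement learning uses exploration to discover better behaviors.
However, the space of possible improvements can be too large to explore from scratch.
For both techniques, learning difficulty grows in proportion to the length of the manipulation task.
To address this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose a task into smaller learning subproblems, and then combines imitation learning and reinforcement learning to maximize their strengths.
We develop novel strategies for training learning agents that are deployed in the context of a planning system.
We evaluate SPIRE on a suite of long-horizon, contact-rich robot manipulation problems.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently.
For more details, see https://sites.google.com/view/spire-corl-2024.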
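
The decomposition and the imitate-then-reinforce recipe in this summary can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the SPIRE paper: the segment names, the 1-D task with step sizes 1 and 2, the tabular policy, and the use of warm-started policy iteration as a stand-in for reinforcement learning. It uses no real TAMP solver and is not the authors' implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple

GOAL = 6           # hypothetical 1-D toy task: move from cell 0 to cell 6
ACTIONS = (1, 2)   # step sizes the agent may choose
MAX_STEPS = 20     # rollout cap so a bad policy cannot loop forever


@dataclass
class Segment:
    name: str
    learned: bool  # False: solved by the planner; True: left to a learned agent


def learning_subproblems(plan: List[Segment]) -> List[str]:
    """Keep only the segments that the planner hands over to a learned agent."""
    return [seg.name for seg in plan if seg.learned]


def step(state: int, action: int) -> int:
    """Deterministic toy dynamics: move right, clipped at GOAL."""
    return min(state + action, GOAL)


def rollout_length(policy: Dict[int, int], state: int) -> int:
    """Number of steps the policy needs to reach GOAL from `state` (capped)."""
    steps = 0
    while state != GOAL and steps < MAX_STEPS:
        state = step(state, policy[state])
        steps += 1
    return steps


def behavior_clone(demos: List[List[Tuple[int, int]]]) -> Dict[int, int]:
    """Tabular imitation: pick the most frequent demonstrated action per state."""
    votes: Dict[int, Counter] = {s: Counter() for s in range(GOAL)}
    for demo in demos:
        for state, action in demo:
            votes[state][action] += 1
    # States never visited in the demos fall back to the smallest step.
    return {s: (c.most_common(1)[0][0] if c else ACTIONS[0])
            for s, c in votes.items()}


def fine_tune(policy: Dict[int, int]) -> Dict[int, int]:
    """Policy iteration warm-started from `policy` (a stand-in for RL).

    An action is swapped in only if it strictly shortens the rollout.
    """
    policy = dict(policy)
    improved = True
    while improved:
        improved = False
        for s in range(GOAL):
            for a in ACTIONS:
                trial = {**policy, s: a}
                if rollout_length(trial, s) < rollout_length(policy, s):
                    policy, improved = trial, True
    return policy


# Hypothetical plan: the planner covers free-space motion, learners cover contact.
plan = [
    Segment("move_to_pregrasp", learned=False),
    Segment("grasp_peg", learned=True),
    Segment("transport", learned=False),
    Segment("insert_peg", learned=True),
]
print(learning_subproblems(plan))

# Two cautious "human" demonstrations that only ever take single steps.
demos = [[(s, 1) for s in range(GOAL)] for _ in range(2)]
bc_policy = behavior_clone(demos)
tuned_policy = fine_tune(bc_policy)
print(rollout_length(bc_policy, 0), rollout_length(tuned_policy, 0))
```

In this toy setting the cloned policy simply copies the demonstrators' single-cell steps, while the reward-driven refinement discovers the larger step and reaches the goal in fewer moves. Real manipulation agents would be learned policies trained in simulation or on hardware, not lookup tables.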

Summary (original)

Robot learning has proven to be a general and effective technique for programming manipulators. Imitation learning is able to teach robots solely from human demonstrations but is bottlenecked by the capabilities of the demonstrations. Reinforcement learning uses exploration to discover better behaviors; however, the space of possible improvements can be too large to start from scratch. And for both techniques, the learning difficulty increases proportional to the length of the manipulation task. Accounting for this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths. We develop novel strategies to train learning agents when deployed in the context of a planning system. We evaluate SPIRE on a suite of long-horizon and contact-rich robot manipulation problems. We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently. View https://sites.google.com/view/spire-corl-2024 for more details.

arXiv information

Authors	Zihan Zhou, Animesh Garg, Dieter Fox, Caelan Garrett, Ajay Mandlekar
Published	2024-10-23 17:42:07+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google

