SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling

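Abstract

Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks.
Prior work has defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions.
We therefore propose SPRINT, a scalable offline policy pre-training approach that substantially reduces the human effort needed to pre-train a diverse set of skills.
Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining through offline reinforcement learning.
As a result, SPRINT pre-training equips robots with a richer repertoire of skills.
Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches.
The website is https://clvrai.com/sprint.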

Abstract (original)

Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining through offline reinforcement learning. As a result, SPRINT pre-training equips robots with a much richer repertoire of skills. Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches. Website at https://clvrai.com/sprint.

arXiv information

Authors	Jesse Zhang, Karl Pertsch, Jiahui Zhang, Joseph J. Lim
Published	2023-06-20 20:59:10+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
