PoCo: Policy Composition from and for Heterogeneous Robot Learning

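Abstract

Training general robot policies from heterogeneous data across different tasks is a major challenge.
Existing robot datasets differ in modality, such as color, depth, tactile, and proprioceptive information, and are collected in different domains, such as simulation, real robots, and human videos.
Current methods typically collect and pool all data from a single domain to train one policy that handles the heterogeneity in tasks and domains, which is prohibitively expensive and difficult.
In this work, we present a flexible approach called Policy Composition that combines information across these diverse modalities and domains to learn manipulation skills that generalize at the scene level and the task level, by composing different data distributions represented with diffusion models.
Our method can use task-level composition for multi-task manipulation, and it can be composed with analytic cost functions to adapt policy behavior at inference time.
We train our method on simulation, human, and real-robot data and evaluate it on tool-use tasks.
The composed policy achieves robust and dexterous performance across varying scenes and tasks and outperforms single-data-source baselines in both simulation and real-world experiments.
See https://liruiw.github.io/policycomp for more details.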
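
The abstract describes composing distributions represented by diffusion models and combining them with analytic cost functions at inference time. A minimal sketch of what such a composition could look like at a single denoising step is given below. It assumes each policy outputs a noise prediction of the same shape, combines them with a weighted sum, and adds a cost gradient as guidance. The function names, the weights, and the guidance rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def compose_noise(predictions, weights):
    """Weighted sum of per-policy noise predictions.

    predictions: sequence of K arrays with identical shape, one per policy
                 (hypothetical outputs of separately trained diffusion policies).
    weights:     sequence of K scalars (hypothetical composition weights).
    """
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if predictions.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one weight per prediction")
    # Contract the policy axis: result has the shape of a single prediction.
    return np.tensordot(weights, predictions, axes=1)


def guided_noise(composed, cost_grad, scale):
    """Add the gradient of an analytic cost to the composed prediction.

    Assumption: the sampler subtracts the noise estimate, so adding the
    cost gradient steers samples toward lower cost.
    """
    return composed + scale * np.asarray(cost_grad, dtype=float)


# Two hypothetical policies predicting noise for a 2-step, 3-D action sequence.
eps_sim = np.ones((2, 3))
eps_real = np.zeros((2, 3))
eps = compose_noise([eps_sim, eps_real], weights=[0.25, 0.75])
eps_guided = guided_noise(eps, cost_grad=np.ones((2, 3)), scale=2.0)
```

In this sketch the weights control how strongly each data source influences the sampled action, and `scale` controls how strongly the analytic cost shapes behavior at inference time; both would need tuning in practice.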

Abstract (original)

Training general robotic policies from heterogeneous data for different tasks is a significant challenge. Existing robotic datasets vary in different modalities such as color, depth, tactile, and proprioceptive information, and are collected in different domains such as simulation, real robots, and human videos. Current methods usually collect and pool all data from one domain to train a single policy to handle such heterogeneity in tasks and domains, which is prohibitively expensive and difficult. In this work, we present a flexible approach, dubbed Policy Composition, to combine information across such diverse modalities and domains for learning scene-level and task-level generalized manipulation skills, by composing different data distributions represented with diffusion models. Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time. We train our method on simulation, human, and real robot data and evaluate in tool-use tasks. The composed policy achieves robust and dexterous performance under varying scenes and tasks and outperforms baselines from a single data source in both simulation and real-world experiments. See https://liruiw.github.io/policycomp for more details.

arXiv information

Authors	Lirui Wang, Jialiang Zhao, Yilun Du, Edward H. Adelson, Russ Tedrake
Published	2024-12-01 15:55:50+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
