EgoMimic: Scaling Imitation Learning via Egocentric Video

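Summary

The scale and diversity of demonstration data that imitation learning requires is a major challenge.
We introduce EgoMimic, a full-stack framework that scales manipulation through human embodiment data, specifically egocentric human video paired with 3D hand tracking.
EgoMimic consists of (1) a system that captures human embodiment data using the ergonomic Project Aria glasses, (2) a low-cost bimanual manipulator that minimizes the kinematic gap to human data, (3) cross-domain data alignment techniques, and (4) an imitation learning architecture that co-trains on human and robot data.
Whereas prior work extracts only high-level intent from human video, our approach treats human and robot data equally as embodied demonstration data and learns a unified policy from both data sources.
EgoMimic achieves significant improvements over state-of-the-art imitation learning methods on a diverse set of long-horizon, single-arm and bimanual manipulation tasks, and enables generalization to entirely new scenes.
Finally, we show a favorable scaling trend for EgoMimic: adding 1 hour of hand data is significantly more valuable than adding 1 hour of robot data.
Videos and additional information are available at https://egomimic.github.io/.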

Abstract (original)

The scale and diversity of demonstration data required for imitation learning is a significant challenge. We present EgoMimic, a full-stack framework which scales manipulation via human embodiment data, specifically egocentric human videos paired with 3D hand tracking. EgoMimic achieves this through: (1) a system to capture human embodiment data using the ergonomic Project Aria glasses, (2) a low-cost bimanual manipulator that minimizes the kinematic gap to human data, (3) cross-domain data alignment techniques, and (4) an imitation learning architecture that co-trains on human and robot data. Compared to prior works that only extract high-level intent from human videos, our approach treats human and robot data equally as embodied demonstration data and learns a unified policy from both data sources. EgoMimic achieves significant improvement on a diverse set of long-horizon, single-arm and bimanual manipulation tasks over state-of-the-art imitation learning methods and enables generalization to entirely new scenes. Finally, we show a favorable scaling trend for EgoMimic, where adding 1 hour of additional hand data is significantly more valuable than 1 hour of additional robot data. Videos and additional information can be found at https://egomimic.github.io/
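To make items (3) and (4) of the abstract more concrete, here is a minimal sketch of what aligning and co-training on two data sources could look like. It is not the authors' implementation: the abstract gives no details, so the per-domain z-score normalization, the one-batch-per-domain sampling, the weighted loss sum, and every name below (`normalize_per_domain`, `cotraining_batches`, `cotraining_loss`, `human_weight`) are assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev


def normalize_per_domain(actions):
    """Z-score each action dimension using this domain's own statistics.

    Assumption: human hand actions and robot actions live on different
    scales, so each source is normalized separately before training.
    """
    dims = list(zip(*actions))
    mus = [mean(d) for d in dims]
    # Fall back to 1.0 when a dimension is constant, to avoid dividing by zero.
    sigmas = [pstdev(d) or 1.0 for d in dims]
    return [[(a - m) / s for a, m, s in zip(row, mus, sigmas)] for row in actions]


def cotraining_batches(human, robot, batch_size, steps, seed=0):
    """Yield one (human_batch, robot_batch) pair per training step."""
    rng = random.Random(seed)
    for _ in range(steps):
        yield rng.sample(human, batch_size), rng.sample(robot, batch_size)


def cotraining_loss(human_loss, robot_loss, human_weight=1.0):
    """Combine the two per-domain losses into one objective.

    Assumption: a plain weighted sum; human_weight is a hypothetical knob.
    """
    return robot_loss + human_weight * human_loss
```

In a real training loop, a shared policy would compute one loss per batch in each pair and the optimizer would step on the combined value, so both data sources shape the same policy.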

arXiv information

Authors	Simar Kareer, Dhruv Patel, Ryan Punamiya, Pranay Mathur, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu
Published	2024-10-31 17:59:55+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
