One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion

Summary

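Deep reinforcement learning techniques are achieving state-of-the-art results in robust legged locomotion.
Although a wide variety of legged platforms exists, such as quadrupeds, humanoids, and hexapods, the field still lacks a single learning framework that can control all of these embodiments easily and effectively and that can possibly transfer, zero-shot or few-shot, to unseen robot embodiments.
To close this gap, we introduce URMA, the Unified Robot Morphology Architecture.
Our framework brings the end-to-end multi-task reinforcement learning approach to the domain of legged robots, enabling the learned policy to control any type of robot morphology.
The key idea of our method is to let the network learn an abstract locomotion controller that can be shared seamlessly between embodiments, thanks to morphology-agnostic encoders and decoders.
This flexible architecture can be seen as a potential first step toward building a foundation model for legged robot locomotion.
Our experiments show that URMA can learn a locomotion policy on multiple embodiments and that this policy can be easily transferred to unseen robot platforms in simulation and in the real world.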
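
The abstract does not describe the internals of the morphology-agnostic encoders and decoders, so the following is only an illustrative sketch of the general idea, not the URMA implementation. It assumes, hypothetically, that every joint comes with a fixed-size description vector and a fixed-size observation vector; the class `SharedPolicy`, the helper `make_robot`, all dimensions, and the random weights are made up for this example. An attention-style weighted sum turns any number of joints into a fixed-size latent, and a shared decoder then emits one action per joint.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real policy would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class SharedPolicy:
    """Hypothetical policy whose weights do not depend on the joint count."""

    def __init__(self, desc_dim, obs_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.W_score = random_matrix(1, desc_dim, rng)         # attention score per joint
        self.W_value = random_matrix(latent_dim, obs_dim, rng)  # per-joint observation embedding
        self.W_core = random_matrix(latent_dim, latent_dim, rng)  # shared "abstract controller"
        self.W_out = random_matrix(1, latent_dim + desc_dim, rng)  # shared per-joint decoder

    def encode(self, descs, obs):
        """Pool any number of joints into one fixed-size latent vector."""
        weights = softmax([matvec(self.W_score, d)[0] for d in descs])
        values = [matvec(self.W_value, o) for o in obs]
        latent_dim = len(self.W_value)
        return [sum(w * v[k] for w, v in zip(weights, values)) for k in range(latent_dim)]

    def act(self, descs, obs):
        """Return one action in [-1, 1] per joint."""
        latent = [math.tanh(z) for z in matvec(self.W_core, self.encode(descs, obs))]
        # The decoder sees the shared latent plus the description of the joint it controls.
        return [math.tanh(matvec(self.W_out, latent + d)[0]) for d in descs]


def make_robot(n_joints, desc_dim, obs_dim, seed):
    """Random stand-in for a robot: per-joint descriptions and observations."""
    rng = random.Random(seed)
    descs = [[rng.uniform(-1, 1) for _ in range(desc_dim)] for _ in range(n_joints)]
    obs = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_joints)]
    return descs, obs


policy = SharedPolicy(desc_dim=4, obs_dim=3, latent_dim=8)
quad_descs, quad_obs = make_robot(12, 4, 3, seed=1)  # e.g. a 12-joint quadruped
hexa_descs, hexa_obs = make_robot(18, 4, 3, seed=2)  # e.g. an 18-joint hexapod
print(len(policy.act(quad_descs, quad_obs)))  # 12
print(len(policy.act(hexa_descs, hexa_obs)))  # 18
```

Because the pooled latent is a weighted sum over joints, it has the same size for every robot and does not depend on the order in which joints are listed, which is one simple way a single set of weights can serve robots with different joint counts.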

Summary (original)

Deep Reinforcement Learning techniques are achieving state-of-the-art results in robust legged locomotion. While there exists a wide variety of legged platforms such as quadruped, humanoids, and hexapods, the field is still missing a single learning framework that can control all these different embodiments easily and effectively and possibly transfer, zero or few-shot, to unseen robot embodiments. We introduce URMA, the Unified Robot Morphology Architecture, to close this gap. Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots, enabling the learned policy to control any type of robot morphology. The key idea of our method is to allow the network to learn an abstract locomotion controller that can be seamlessly shared between embodiments thanks to our morphology-agnostic encoders and decoders. This flexible architecture can be seen as a potential first step in building a foundation model for legged robot locomotion. Our experiments show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms in simulation and the real world.

arXiv information

Authors	Nico Bohlinger, Grzegorz Czechmanowski, Maciej Krupka, Piotr Kicki, Krzysztof Walas, Jan Peters, Davide Tateo
Published	2024-09-10 09:44:15+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
