Task-priority Intermediated Hierarchical Distributed Policies: Reinforcement Learning of Adaptive Multi-robot Cooperative Transport

Abstract

Multi-robot cooperative transport is crucial in logistics, housekeeping, and disaster response. However, it poses significant challenges in environments where objects of various weights are mixed and the number of robots and objects varies. This paper presents Task-priority Intermediated Hierarchical Distributed Policies (TIHDP), a multi-agent Reinforcement Learning (RL) framework that addresses these challenges through a hierarchical policy structure. TIHDP consists of three layers: task allocation policy (higher layer), dynamic task priority (intermediate layer), and robot control policy (lower layer). Whereas the dynamic task priority layer can manipulate the priority of any object to be transported by receiving global object information and communicating with other robots, the task allocation and robot control policies are restricted by local observations/actions so that they are not affected by changes in the number of objects and robots. Through simulations and real-robot demonstrations, TIHDP shows promising adaptability and performance of the learned multi-robot cooperative transport, even in environments with varying numbers of robots and objects. A video is available at https://youtu.be/Rmhv5ovj0xM.
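The three-layer split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Obj`, `Robot`, `update_priorities`, `allocate`, and `control`, the distance-based priority rule, and the penalty value are all hypothetical, and the learned policies are replaced by hand-written stand-ins. The sketch only shows how a per-object priority table can absorb a varying number of objects while the higher and lower layers keep fixed-size inputs and outputs.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Obj:
    obj_id: int
    x: float
    y: float


@dataclass
class Robot:
    x: float
    y: float
    # Intermediate-layer state: one priority per object id, for any number of objects.
    priority: dict = field(default_factory=dict)


def update_priorities(robot, objects, claimed_by_others):
    """Intermediate layer (hypothetical rule, not from the paper).

    Uses global object information plus ids reported by other robots.
    """
    for o in objects:
        score = -math.hypot(o.x - robot.x, o.y - robot.y)  # nearer objects rank higher
        if o.obj_id in claimed_by_others:
            score -= 10.0  # hypothetical penalty for objects other robots already handle
        robot.priority[o.obj_id] = score


def allocate(robot, switch):
    """Higher layer stand-in: a fixed-size binary action (keep or switch target).

    In an RL setting `switch` would come from a learned policy.
    """
    ranked = sorted(robot.priority, key=robot.priority.get, reverse=True)
    if switch and len(ranked) > 1:
        return ranked[1]
    return ranked[0]


def control(robot, target, max_speed=1.0):
    """Lower layer stand-in: velocity toward the target from its relative position only."""
    dx, dy = target.x - robot.x, target.y - robot.y
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return (0.0, 0.0)
    scale = min(max_speed, dist) / dist
    return (dx * scale, dy * scale)


robot = Robot(0.0, 0.0)
objects = [Obj(0, 1.0, 0.0), Obj(1, 3.0, 0.0)]
update_priorities(robot, objects, claimed_by_others={0})
target_id = allocate(robot, switch=False)
velocity = control(robot, objects[target_id])
```

In this sketch, adding or removing entries in `objects` changes only the size of the priority table; the signatures of `allocate` and `control` stay the same, which is the property the abstract attributes to the local policies.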

arXiv information

Authors	Yusei Naito, Tomohiko Jimbo, Tadashi Odashima, Takamitsu Matsubara
Published	2024-04-02 23:19:02+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
