RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments

Summary

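We present a new reinforcement learning based algorithm for the multi-robot task allocation problem in warehouse environments.
We formulate the problem as a Markov Decision Process and solve it with a new deep multi-agent reinforcement learning method, called RTAW, that uses an attention-inspired policy architecture.
As a result, the proposed policy network uses global embeddings that are independent of the number of robots and tasks.
We train with the proximal policy optimization (PPO) algorithm and use a carefully designed reward to obtain a converged policy.
The converged policy ensures cooperation among the robots to minimize total travel delay (TTD), which ultimately improves the makespan for a sufficiently large task list.
In extensive experiments, we compare RTAW against state-of-the-art methods, such as myopic pickup distance minimization (greedy) and regret-based baselines, under different navigation schemes.
We show an improvement of up to 14% (25 to 1000 seconds) in TTD on scenarios with hundreds or thousands of tasks, across different challenging warehouse layouts and task generation schemes.
We also demonstrate the scalability of the approach by showing its performance with up to 1000 robots in simulation.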
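
To make the idea of an embedding that does not depend on the number of robots or tasks more concrete, here is a minimal sketch of softmax-weighted (attention-style) pooling in plain Python. It is an illustration only, not the RTAW policy network: the function name `attention_pool`, the three-number task features, and the fixed `query` vector are all assumptions made for this example. In a trained policy the query and the features would be learned.

```python
import math
from typing import List, Sequence


def attention_pool(features: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Softmax-weighted average of per-entity feature vectors.

    The output length equals the feature dimension, however many
    entities (robots or tasks) are passed in.
    """
    if not features:
        raise ValueError("need at least one entity")
    dim = len(query)
    scale = math.sqrt(dim)
    # Scaled dot-product score of each entity against the query.
    scores = [sum(q * f for q, f in zip(query, feat)) / scale
              for feat in features]
    # Numerically stable softmax weights.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    # Weighted average, computed one feature dimension at a time.
    return [sum(w * feat[i] for w, feat in zip(weights, features)) / total
            for i in range(dim)]


# Hypothetical per-task features: (pickup x, pickup y, waiting time).
few_tasks = [[1.0, 2.0, 0.5], [4.0, 0.0, 1.5]]
many_tasks = few_tasks * 50          # 100 tasks instead of 2
query = [0.1, 0.1, 1.0]              # assumed fixed; learned in practice

small = attention_pool(few_tasks, query)
large = attention_pool(many_tasks, query)
print(len(small), len(large))
```

Both calls return a vector of length 3, so a downstream layer sees the same input size whether the warehouse has 2 pending tasks or 100.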

Summary (original)

We present a novel reinforcement learning based algorithm for multi-robot task allocation problem in warehouse environments. We formulate it as a Markov Decision Process and solve via a novel deep multi-agent reinforcement learning method (called RTAW) with attention inspired policy architecture. Hence, our proposed policy network uses global embeddings that are independent of the number of robots/tasks. We utilize proximal policy optimization algorithm for training and use a carefully designed reward to obtain a converged policy. The converged policy ensures cooperation among different robots to minimize total travel delay (TTD) which ultimately improves the makespan for a sufficiently large task-list. In our extensive experiments, we compare the performance of our RTAW algorithm to state-of-the-art methods such as myopic pickup distance minimization (greedy) and regret-based baselines on different navigation schemes. We show an improvement of up to 14% (25-1000 seconds) in TTD on scenarios with hundreds or thousands of tasks for different challenging warehouse layouts and task generation schemes. We also demonstrate the scalability of our approach by showing performance with up to 1000 robots in simulations.
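
The greedy baseline named in the abstract, myopic pickup distance minimization, can be sketched in a few lines. This is a hedged illustration rather than the authors' code: it assumes a grid warehouse with Manhattan distance, processes robots in a fixed order, and uses made-up names (`greedy_assign`, `pending`, `robots`) and coordinates.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Grid distance between two cells (an assumed distance metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_assign(robot: Point, pickups: Dict[str, Point]) -> str:
    """Return the id of the pending task whose pickup cell is nearest.

    Ties are broken by task id so the result is deterministic.
    """
    if not pickups:
        raise ValueError("no pending tasks")
    return min(sorted(pickups),
               key=lambda tid: manhattan(robot, pickups[tid]))


# Hypothetical pending tasks (pickup cells) and free robots.
pending = {"t1": (5, 5), "t2": (1, 0), "t3": (9, 2)}
robots = {"r1": (0, 0), "r2": (8, 1)}

assignment = {}
for rid, pos in robots.items():
    tid = greedy_assign(pos, pending)
    assignment[rid] = tid
    del pending[tid]   # a task is handed to exactly one robot

print(assignment)  # → {'r1': 't2', 'r2': 't3'}
```

Each robot looks only at its own nearest pickup and ignores what that choice costs the others, which is why the abstract describes this rule as myopic. The abstract presents the learned policy as coordinating the robots to reduce TTD instead.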

arXiv information

Authors	Aakriti Agrawal, Amrit Singh Bedi, Dinesh Manocha
Published	2023-02-27 18:54:27+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
