Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers

Summary

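We envision a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items.
The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance (e.g. order throughput).
Established industry methods based on heuristics require large engineering efforts to optimise for innately variable warehouse configurations.
In contrast, multi-agent reinforcement learning (MARL) can be applied flexibly to diverse warehouse configurations (e.g. size, layout, number and types of workers, item replenishment frequency), because the agents learn through experience how to cooperate optimally with one another.
We develop hierarchical MARL algorithms in which a manager assigns goals to worker agents, and the manager and worker policies are co-trained to maximise a global objective (e.g. pick rate).
Our hierarchical algorithms achieve significant gains in sample efficiency and overall pick rate over baseline MARL algorithms across diverse warehouse configurations, and substantially outperform two established industry heuristics for order-picking systems.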

Abstract (original)

We envision a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items within the warehouse. The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance (e.g. order throughput). Established industry methods using heuristic approaches require large engineering efforts to optimise for innately variable warehouse configurations. In contrast, multi-agent reinforcement learning (MARL) can be flexibly applied to diverse warehouse configurations (e.g. size, layout, number/types of workers, item replenishment frequency), as the agents learn through experience how to optimally cooperate with one another. We develop hierarchical MARL algorithms in which a manager assigns goals to worker agents, and the policies of the manager and workers are co-trained toward maximising a global objective (e.g. pick rate). Our hierarchical algorithms achieve significant gains in sample efficiency and overall pick rates over baseline MARL algorithms in diverse warehouse configurations, and substantially outperform two established industry heuristics for order-picking systems.
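The manager/worker split described in the abstract can be illustrated with a minimal structural sketch. This is not the authors' algorithm: the grid world, the names `Worker`, `manager_policy`, `worker_policy` and `run_episode`, and both hand-written placeholder policies are assumptions made purely for illustration. In a hierarchical MARL setup, both policies would be learned and co-trained toward the shared objective, which here is simply the number of picks.

```python
import random
from dataclasses import dataclass
from typing import Optional, Set, Tuple

Position = Tuple[int, int]


@dataclass
class Worker:
    position: Position
    goal: Optional[Position] = None


def manager_policy(worker: Worker, open_orders: Set[Position],
                   rng: random.Random) -> Position:
    """Placeholder manager: assign a random open order location as the goal.

    Hypothetical stand-in for a learned manager policy.
    """
    return rng.choice(sorted(open_orders))


def worker_policy(worker: Worker) -> Position:
    """Placeholder worker: move one grid cell toward the assigned goal.

    Hypothetical stand-in for a learned worker policy.
    """
    (x, y), (gx, gy) = worker.position, worker.goal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return (x, y)


def run_episode(workers, orders, max_steps: int, seed: int = 0) -> int:
    """Run one episode and return the number of picks (the global objective)."""
    rng = random.Random(seed)
    open_orders = set(orders)
    picks = 0
    for _ in range(max_steps):
        if not open_orders:
            break
        for w in workers:
            # The manager (re)assigns a goal when the worker has none,
            # or when another worker has already picked its goal.
            if w.goal not in open_orders:
                w.goal = manager_policy(w, open_orders, rng)
            w.position = worker_policy(w)
            if w.position == w.goal:
                open_orders.discard(w.goal)
                picks += 1
                w.goal = None
                if not open_orders:
                    break
    return picks


workers = [Worker((0, 0)), Worker((4, 4))]
orders = [(1, 2), (3, 3), (0, 4)]
total_picks = run_episode(workers, orders, max_steps=50)
```

In a learned version, the return value of `run_episode` (or a rate derived from it) would serve as the shared reward signal for updating both the manager and the worker policies.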

arXiv information

Authors	Aleksandar Krnjaic, Raul D. Steleac, Jonathan D. Thomas, Georgios Papoudakis, Lukas Schäfer, Andrew Wing Keung To, Kuan-Ho Lao, Murat Cubuktepe, Matthew Haley, Peter Börsting, Stefano V. Albrecht
Published	2023-07-07 17:20:25+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
