Solving the Paint Shop Problem with Flexible Management of Multi-Lane Buffers Using Reinforcement Learning and Action Masking

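Abstract

In the paint shop problem, an unordered incoming sequence of cars, each assigned a color, must be reshuffled so that the number of color changes is minimized. To reshuffle the incoming sequence, manufacturers can use a first-in-first-out multi-lane buffer system that supports store and retrieve operations. So far, prior studies have focused mainly on simple decision heuristics such as greedy, or on simplified problem variants that do not allow full flexibility when performing store and retrieve operations. In this study, we propose a reinforcement learning approach that minimizes color changes for the flexible problem variant, in which store and retrieve operations can be performed in an arbitrary order. After proving that greedy retrieval is optimal, we incorporate this finding into the model using action masking. Our evaluation, based on 170 problem instances with 2 to 8 buffer lanes and 5 to 15 colors, shows that our approach reduces color changes compared to existing methods by considerable margins, depending on the problem size. Furthermore, we demonstrate the robustness of our approach to different buffer sizes and imbalanced color distributions.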

Abstract (original)

In the paint shop problem, an unordered incoming sequence of cars assigned to different colors has to be reshuffled with the objective of minimizing the number of color changes. To reshuffle the incoming sequence, manufacturers can employ a first-in-first-out multi-lane buffer system allowing store and retrieve operations. So far, prior studies primarily focused on simple decision heuristics like greedy or simplified problem variants that do not allow full flexibility when performing store and retrieve operations. In this study, we propose a reinforcement learning approach to minimize color changes for the flexible problem variant, where store and retrieve operations can be performed in an arbitrary order. After proving that greedy retrieval is optimal, we incorporate this finding into the model using action masking. Our evaluation, based on 170 problem instances with 2-8 buffer lanes and 5-15 colors, shows that our approach reduces color changes compared to existing methods by considerable margins depending on the problem size. Furthermore, we demonstrate the robustness of our approach towards different buffer sizes and imbalanced color distributions.
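
The abstract does not spell out the environment or the mask, so the following is only a minimal sketch of how a first-in-first-out multi-lane buffer and an action mask could fit together. The class `MultiLaneBuffer`, the functions `count_color_changes` and `action_mask`, the action layout (store actions first, then retrieve actions), and the reading of "greedy retrieval" as "retrieve a front car whose color matches the last painted color whenever one exists" are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque


def count_color_changes(sequence):
    """Count adjacent pairs of cars whose colors differ."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


class MultiLaneBuffer:
    """Hypothetical FIFO multi-lane buffer: cars enter at the back, leave at the front."""

    def __init__(self, num_lanes, lane_capacity):
        self.lanes = [deque() for _ in range(num_lanes)]
        self.lane_capacity = lane_capacity

    def store(self, lane, color):
        if len(self.lanes[lane]) >= self.lane_capacity:
            raise ValueError(f"lane {lane} is full")
        self.lanes[lane].append(color)

    def retrieve(self, lane):
        if not self.lanes[lane]:
            raise ValueError(f"lane {lane} is empty")
        return self.lanes[lane].popleft()


def action_mask(buffer, has_incoming, last_color):
    """Return a list of booleans, one per action (True means allowed).

    Assumed action layout: indices 0..n-1 store the next incoming car in
    lane i, indices n..2n-1 retrieve the front car of lane i.
    """
    lanes = buffer.lanes
    store_ok = [has_incoming and len(lane) < buffer.lane_capacity for lane in lanes]
    retrieve_ok = [len(lane) > 0 for lane in lanes]

    # Assumed greedy-retrieval rule: if some front car matches the last
    # painted color, allow only retrievals of such cars.
    matching = [bool(lane) and lane[0] == last_color for lane in lanes]
    if any(matching):
        return [False] * len(lanes) + matching

    return store_ok + retrieve_ok
```

With two lanes holding one red and one blue car and `last_color="red"`, this mask leaves only the retrieval from the red lane enabled; with a last color that matches no front car, every feasible store and retrieve action stays available. A policy trained with masking would then sample only from the allowed actions.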

arXiv information

Authors	Mirko Stappert, Bernhard Lutz, Janis Brammer, Dirk Neumann
Published	2025-04-03 14:37:40+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
