A Reinforcement Learning Approach for Scheduling Problems With Improved Generalization Through Order Swapping

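Summary

Scheduling production resources (for example, assigning jobs to machines) plays an important role in manufacturing, not only for saving energy but also for raising overall efficiency. Among the various job scheduling problems, this work addresses the job shop scheduling problem (JSSP). The JSSP is an NP-hard combinatorial optimization problem (COP), for which exhaustive search is infeasible. Simple heuristics such as first in, first out (FIFO) and longest processing time (LPT), and metaheuristics such as tabu search, are often used to solve the problem by truncating the search space. For large problem sizes, however, these methods become impractical because their solutions are either far from the optimum or take too long to compute. In recent years, research on using deep reinforcement learning (DRL) to solve COPs has attracted interest and has shown promising results in solution quality and computational efficiency. This work presents a novel DRL approach to the JSSP that examines two objectives: generalization and solution effectiveness. In particular, it employs the proximal policy optimization (PPO) algorithm, a policy-gradient method found to perform well in the constrained dispatching of jobs. An order swapping mechanism (OSM) is incorporated into the environment to achieve better generalized learning of the problem. The performance of the approach is analyzed in depth using a set of available benchmark instances and compared with results from other groups.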

Summary (original)

The scheduling of production resources (such as associating jobs to machines) plays a vital role for the manufacturing industry, not only for saving energy but also for increasing the overall efficiency. Among the different job scheduling problems, the JSSP is addressed in this work. JSSP falls into the category of NP-hard COP, in which solving the problem through exhaustive search becomes infeasible. Simple heuristics such as FIFO and LPT, and metaheuristics such as tabu search, are often adopted to solve the problem by truncating the search space. These methods become impractical for large problem sizes, as the result is either far from the optimum or time consuming to obtain. In recent years, research on using DRL to solve COP has gained interest and has shown promising results in terms of solution quality and computational efficiency. In this work, we provide a novel approach to solve the JSSP, examining the objectives of generalization and solution effectiveness using DRL. In particular, we employ the PPO algorithm, which adopts the policy-gradient paradigm that is found to perform well in the constrained dispatching of jobs. We incorporated an OSM in the environment to achieve better generalized learning of the problem. The performance of the presented approach is analyzed in depth by using a set of available benchmark instances and comparing our results with the work of other groups.
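
The simple dispatching heuristics named in the abstract (FIFO, LPT) can be illustrated with a minimal Python sketch. This is not the authors' PPO or OSM method, and it does not use their benchmark instances. The function `dispatch`, the two rule definitions, and the three-job instance are all hypothetical, chosen only to show how a priority rule turns a JSSP instance into a schedule and a makespan. FIFO is assumed here to mean "the job that became ready earliest goes first", and LPT to mean "the longest pending operation goes first".

```python
from typing import Callable, List, Tuple

# One job is an ordered list of (machine_id, processing_time) operations.
Job = List[Tuple[int, int]]
# A rule maps (time the job became ready, duration of its next operation)
# to a score; the candidate with the lowest score is dispatched first.
Rule = Callable[[int, int], float]


def dispatch(jobs: List[Job], rule: Rule) -> int:
    """Greedily schedule one operation at a time and return the makespan."""
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    next_op = [0] * len(jobs)         # index of each job's next operation
    job_ready = [0] * len(jobs)       # finish time of each job's last operation
    machine_ready = [0] * n_machines  # time at which each machine becomes free
    remaining = sum(len(job) for job in jobs)

    while remaining:
        # Only the next unscheduled operation of each job is eligible,
        # which preserves the operation order inside every job.
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        j = min(
            candidates,
            key=lambda c: (rule(job_ready[c], jobs[c][next_op[c]][1]), c),
        )
        machine, duration = jobs[j][next_op[j]]
        # Start once both the job and the machine are free (no gap filling).
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
        remaining -= 1

    return max(job_ready)


def fifo(ready_time: int, duration: int) -> float:
    return ready_time   # earliest-ready job first (assumed FIFO reading)


def lpt(ready_time: int, duration: int) -> float:
    return -duration    # longest pending operation first


# Hypothetical toy instance: 3 jobs on 3 machines.
toy_jobs: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]

fifo_makespan = dispatch(toy_jobs, fifo)
lpt_makespan = dispatch(toy_jobs, lpt)
print("FIFO makespan:", fifo_makespan)
print("LPT makespan:", lpt_makespan)
```

Because each rule looks only at the next operation of each job, the resulting makespan can sit well above the optimum, which is the weakness the abstract attributes to such heuristics on large instances.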

arXiv information

Authors	Deepak Vivekanandan, Samuel Wirth, Patrick Karlbauer, Noah Klarmann
Published	2023-03-06 14:44:40+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, DeepL
