Spatial-Aware Deep Reinforcement Learning for the Traveling Officer Problem


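The traveling officer problem (TOP) is a difficult stochastic optimization task.
The main challenge in TOP is the dynamic nature of parking offenses, which appear at random and vanish after a while, whether or not a fine has been issued.
This paper proposes SATOP, a new spatial-aware deep reinforcement learning approach for TOP.
The results show that SATOP consistently outperforms state-of-the-art TOP agents and can fine up to 22% more parking offenses.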


The traveling officer problem (TOP) is a challenging stochastic optimization task. In this problem, a parking officer is guided through a city equipped with parking sensors to fine as many parking offenders as possible. A major challenge in TOP is the dynamic nature of parking offenses, which appear at random and disappear after some time, regardless of whether they have been fined. Thus, solutions need to adjust dynamically to currently fineable parking offenses while also planning ahead to increase the likelihood that the officer arrives while an offense is still taking place. Though various solutions exist, these methods often struggle to account for how an action affects the ability to fine future parking violations. This paper proposes SATOP, a novel spatial-aware deep reinforcement learning approach for TOP. Our state encoder creates a representation of each action, leveraging the spatial relationships between parking spots, the agent, and the action. Furthermore, we propose a message-passing module for learning future inter-action correlations in the given environment. Thus, the agent can estimate the potential to fine further parking violations after executing an action. We evaluate our method using an environment based on real-world data from Melbourne. Our results show that SATOP consistently outperforms state-of-the-art TOP agents and is able to fine up to 22% more parking offenses.
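To make the problem setting concrete, the following is a minimal toy sketch of a TOP-like environment with a greedy baseline. It is not SATOP and not the Melbourne environment used in the paper: the one-dimensional street, the class `ToyTOP`, the functions `greedy_target` and `run_episode`, and the parameters `p_appear` and `max_duration` are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyTOP:
    """Hypothetical toy TOP: parking spots on a line, one officer."""
    n_spots: int = 10
    p_appear: float = 0.1   # assumed per-step chance a free spot gets a violation
    max_duration: int = 5   # assumed upper bound on how long a violation lasts
    seed: int = 0
    position: int = 0
    remaining: dict = field(default_factory=dict)  # spot -> steps left
    fined: set = field(default_factory=set)        # active spots already fined

    def __post_init__(self):
        self.rng = random.Random(self.seed)

    def step(self, target: int) -> int:
        """Move one spot toward `target`; return 1 if a violation is fined."""
        target = max(0, min(self.n_spots - 1, target))
        self.position += (target > self.position) - (target < self.position)
        # Violations age and disappear whether or not they were fined.
        for spot in list(self.remaining):
            self.remaining[spot] -= 1
            if self.remaining[spot] <= 0:
                del self.remaining[spot]
                self.fined.discard(spot)
        # New violations appear at random on spots without an active one.
        for spot in range(self.n_spots):
            if spot not in self.remaining and self.rng.random() < self.p_appear:
                self.remaining[spot] = self.rng.randint(1, self.max_duration)
        # Fine the violation at the officer's spot, at most once per violation.
        if self.position in self.remaining and self.position not in self.fined:
            self.fined.add(self.position)
            return 1
        return 0


def greedy_target(env: ToyTOP) -> int:
    """Greedy baseline: head for the nearest violation not yet fined."""
    open_spots = [s for s in env.remaining if s not in env.fined]
    if not open_spots:
        return env.position
    return min(open_spots, key=lambda s: abs(s - env.position))


def run_episode(env: ToyTOP, steps: int) -> int:
    """Total number of fines the greedy officer collects in `steps` steps."""
    return sum(env.step(greedy_target(env)) for _ in range(steps))


if __name__ == "__main__":
    print(run_episode(ToyTOP(seed=1), 200))
```

The greedy officer only reacts to currently visible violations, so it can walk toward an offense that expires before arrival. That is the kind of short-sightedness the abstract describes, and it is why the paper argues for estimating the potential to fine further violations after each action.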


Authors: Niklas Strauß, Matthias Schubert
Published: 2024-01-11 15:16:20+00:00
