Speeding Up Path Planning via Reinforcement Learning in MCTS for Automated Parking

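Summary

This paper presents a method that integrates reinforcement learning into Monte Carlo tree search (MCTS) to improve online path planning for automated parking tasks in fully observable environments.
Sampling-based planning methods in high-dimensional spaces can be computationally expensive and time-consuming.
State evaluation methods help by bringing prior knowledge into the search steps, which speeds up the process in real-time systems.
Because automated parking tasks are often carried out in complex environments, it is difficult to build robust yet lightweight heuristic guidance with traditional analytical methods.
To overcome this limitation, we propose a reinforcement learning pipeline with Monte Carlo tree search within a path planning framework.
By iteratively learning the value of a state and the best action among samples from the previous cycle's outcomes, the pipeline can model a value estimator and a policy generator for given states.
This builds a mechanism that balances exploration and exploitation, speeding up the path planning process while maintaining path quality without using data from human expert drivers.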
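
The summary does not say which selection rule the search uses to balance exploration and exploitation. As an illustration only, the sketch below assumes a PUCT-style rule of the kind used in AlphaZero-like systems, where a learned policy prior steers exploration and a learned value estimate drives exploitation. The `Node` fields, the `c_puct` constant, and the parking action names are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node. All field names are hypothetical."""
    prior: float            # policy generator's probability for the action leading here
    visits: int = 0         # how many times the search has passed through this node
    value_sum: float = 0.0  # sum of value estimates backed up through this node
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        # Unvisited nodes default to 0.0 so the prior alone decides their score.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Exploitation term (mean value) plus prior-weighted exploration bonus."""
    exploit = child.mean_value()
    explore = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return exploit + explore


def select_action(parent: Node, c_puct: float = 1.5):
    """Pick the child action with the highest PUCT score."""
    return max(
        parent.children,
        key=lambda a: puct_score(parent, parent.children[a], c_puct),
    )


def backup(path, value: float) -> None:
    """Propagate a value estimate along the visited path (root first)."""
    for node in path:
        node.visits += 1
        node.value_sum += value


# Toy example: three hypothetical steering actions from one parking state.
root = Node(prior=1.0, visits=10)
root.children = {
    "reverse_left": Node(prior=0.6),
    "reverse_straight": Node(prior=0.3, visits=8, value_sum=4.0),
    "forward_right": Node(prior=0.1, visits=2, value_sum=0.4),
}
chosen = select_action(root)
backup([root, root.children[chosen]], value=0.2)
```

In this toy example the unvisited action with the largest prior is selected first, which shows how a learned policy could focus the search on promising maneuvers before any rollouts have been spent on them. In a pipeline like the one summarized above, the priors and backed-up values would come from the learned policy generator and value estimator rather than from hand-set numbers.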

Summary (original)

In this paper, we address a method that integrates reinforcement learning into the Monte Carlo tree search to boost online path planning under fully observable environments for automated parking tasks. Sampling-based planning methods under high-dimensional space can be computationally expensive and time-consuming. State evaluation methods are useful by leveraging the prior knowledge into the search steps, making the process faster in a real-time system. Given the fact that automated parking tasks are often executed under complex environments, a solid but lightweight heuristic guidance is challenging to compose in a traditional analytical way. To overcome this limitation, we propose a reinforcement learning pipeline with a Monte Carlo tree search under the path planning framework. By iteratively learning the value of a state and the best action among samples from its previous cycle’s outcomes, we are able to model a value estimator and a policy generator for given states. By doing that, we build up a balancing mechanism between exploration and exploitation, speeding up the path planning process while maintaining its quality without using human expert driver data.

arXiv information

Authors	Xinlong Zheng, Xiaozhou Zhang, Donghao Xu
Published	2024-12-31 06:53:26+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
