Autonomous vehicle decision and control through reinforcement learning with traffic flow randomization

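Summary

Most current research on reinforcement-learning-based decision-making and control for autonomous vehicles is conducted in simulated environments.
Training and testing in these studies take place under rule-based microscopic traffic flow, and transfer to real or near-real environments to test performance is rarely considered.
As a result, performance may degrade when a trained model is tested in more realistic traffic scenes.
This study proposes a method that randomizes the driving style and behavior of surrounding vehicles by randomizing certain parameters of the car-following and lane-changing models of rule-based microscopic traffic flow in SUMO.
Policies were trained with deep reinforcement learning algorithms under the domain-randomized rule-based microscopic traffic flow in freeway and merging scenes, then tested separately under rule-based and high-fidelity microscopic traffic flow.
The results indicate that the policy trained under domain-randomized traffic flow achieves a significantly better success rate and calculative reward than models trained under other microscopic traffic flows.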
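
The randomization idea in the summary can be illustrated with a minimal Python sketch that samples a different set of car-following and lane-changing attributes for each surrounding-vehicle type and writes them as SUMO vType elements. The attribute names (accel, decel, tau, minGap, speedFactor, lcCooperative, lcSpeedGain, lcAssertive) are standard SUMO vType attributes, but the choice of attributes, the sampling ranges, the IDM and LC2013 models, and the helper names sample_vtype and build_vtypes are assumptions made for illustration; the summary does not state which parameters or ranges the authors used.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical sampling ranges (illustrative assumptions, not values from the paper).
PARAM_RANGES = {
    "accel": (1.5, 3.5),          # max acceleration, m/s^2
    "decel": (3.0, 6.0),          # comfortable deceleration, m/s^2
    "tau": (0.8, 2.0),            # desired time headway, s
    "minGap": (1.5, 3.5),         # standstill gap, m
    "speedFactor": (0.8, 1.2),    # multiplier on the lane speed limit
    "lcCooperative": (0.0, 1.0),  # willingness to change lanes cooperatively
    "lcSpeedGain": (0.5, 2.0),    # eagerness to change lanes to gain speed
    "lcAssertive": (0.5, 2.0),    # willingness to accept smaller gaps
}


def sample_vtype(vtype_id, rng):
    """Return one <vType> element with uniformly sampled parameters."""
    attrs = {
        "id": vtype_id,
        "carFollowModel": "IDM",      # assumed car-following model
        "laneChangeModel": "LC2013",  # assumed lane-changing model
    }
    for name, (low, high) in PARAM_RANGES.items():
        attrs[name] = f"{rng.uniform(low, high):.2f}"
    return ET.Element("vType", attrs)


def build_vtypes(n, seed=None):
    """Build an <additional> element holding n randomized vehicle types."""
    rng = random.Random(seed)
    root = ET.Element("additional")
    for i in range(n):
        root.append(sample_vtype(f"random_{i}", rng))
    return root


if __name__ == "__main__":
    # Print the XML that could be saved as a SUMO additional file.
    print(ET.tostring(build_vtypes(3, seed=0), encoding="unicode"))
```

Resampling the vehicle types at the start of each training episode (for example, by changing the seed) would expose the policy to a different mix of driving styles each time, which is the general intent of domain randomization.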

Abstract (original)

Most of the current studies on autonomous vehicle decision-making and control tasks based on reinforcement learning are conducted in simulated environments. The training and testing of these studies are carried out under rule-based microscopic traffic flow, with little consideration of migrating them to real or near-real environments to test their performance. It may lead to a degradation in performance when the trained model is tested in more realistic traffic scenes. In this study, we propose a method to randomize the driving style and behavior of surrounding vehicles by randomizing certain parameters of the car-following model and the lane-changing model of rule-based microscopic traffic flow in SUMO. We trained policies with deep reinforcement learning algorithms under the domain randomized rule-based microscopic traffic flow in freeway and merging scenes, and then tested them separately in rule-based microscopic traffic flow and high-fidelity microscopic traffic flow. Results indicate that the policy trained under domain randomization traffic flow has significantly better success rate and calculative reward compared to the models trained under other microscopic traffic flows.

arXiv information

Authors	Yuan Lin, Antai Xie, Xiao Liu
Published	2024-03-05 11:41:43+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google