Graph-based Prediction and Planning Policy Network (GP3Net) for scalable self-driving in dynamic environments using Deep Reinforcement Learning

Summary

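Recent advances in motion planning for autonomous vehicles (AVs) show great promise in using expert driver behavior in non-stationary driving environments.
However, learning from expert drivers alone does not generalize well enough to recover from domain shifts and near-failure scenarios caused by the dynamic behavior of traffic participants and by weather conditions.
This work proposes a deep Graph-based Prediction and Planning Policy Network (GP3Net) framework for non-stationary environments; it encodes the interactions between traffic participants together with contextual information and outputs decisions for safe AV maneuvers.
A spatio-temporal graph models the interactions between traffic participants in order to predict their future trajectories.
The predicted trajectories are used to generate a future occupancy map around the AV, with uncertainties embedded, to anticipate the evolving non-stationary driving environment.
The contextual information and the future occupancy maps are then fed to the policy network of the GP3Net framework, which is trained with the Proximal Policy Optimization (PPO) algorithm.
GP3Net is evaluated on standard CARLA benchmark scenarios with domain shifts in traffic patterns (urban, highway, and mixed).
The results show that GP3Net outperforms previous state-of-the-art imitation-learning-based planning models across different towns.
Furthermore, in previously unseen weather conditions, GP3Net completes the desired route with fewer traffic infractions.
Finally, the results highlight the benefit of including the prediction module for improving safety in non-stationary environments.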
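
The step from predicted trajectories to an uncertainty-aware future occupancy map can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: the function name `future_occupancy_map`, the ego-centred grid, the isotropic Gaussian spread per step, and the noisy-OR rule for combining agents are all hypothetical choices, since the abstract does not specify how the map is built.

```python
import numpy as np

def future_occupancy_map(trajs, sigmas, grid_size=64, extent=32.0):
    """Rasterise predicted positions into one occupancy grid per future step.

    trajs:  (N, T, 2) predicted x/y positions in the ego frame, in metres.
    sigmas: (T,) positional standard deviation per step, in metres
            (assumed to grow with the horizon; a hypothetical uncertainty model).
    Returns an array of shape (T, grid_size, grid_size) with values in [0, 1],
    indexed as [step, row (y), column (x)].
    """
    trajs = np.asarray(trajs, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Cell-centre coordinates covering [-extent, extent] on both axes.
    centres = (np.arange(grid_size) + 0.5) / grid_size * 2 * extent - extent
    gx, gy = np.meshgrid(centres, centres, indexing="xy")

    n_agents, n_steps, _ = trajs.shape
    occ = np.zeros((n_steps, grid_size, grid_size))
    for t in range(n_steps):
        free = np.ones((grid_size, grid_size))
        for i in range(n_agents):
            d2 = (gx - trajs[i, t, 0]) ** 2 + (gy - trajs[i, t, 1]) ** 2
            # Unnormalised Gaussian with peak value 1 at the predicted position.
            p = np.exp(-0.5 * d2 / sigmas[t] ** 2)
            # Noisy-OR: a cell stays free only if every agent misses it.
            free *= 1.0 - p
        occ[t] = 1.0 - free
    return occ

# One agent moving along +x over three future steps, with growing uncertainty.
trajs = np.array([[[0.5, 0.5], [2.5, 0.5], [4.5, 0.5]]])
occ = future_occupancy_map(trajs, sigmas=[0.5, 1.0, 1.5])
print(occ.shape)  # (3, 64, 64)
```

Stacking the per-step grids gives an image-like tensor that a policy network could consume alongside the contextual information; whether GP3Net uses this layout is not stated in the abstract.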

Summary (original)

Recent advancements in motion planning for Autonomous Vehicles (AVs) show great promise in using expert driver behaviors in non-stationary driving environments. However, learning only through expert drivers needs more generalizability to recover from domain shifts and near-failure scenarios due to the dynamic behavior of traffic participants and weather conditions. A deep Graph-based Prediction and Planning Policy Network (GP3Net) framework is proposed for non-stationary environments that encodes the interactions between traffic participants with contextual information and provides a decision for safe maneuver for AV. A spatio-temporal graph models the interactions between traffic participants for predicting the future trajectories of those participants. The predicted trajectories are utilized to generate a future occupancy map around the AV with uncertainties embedded to anticipate the evolving non-stationary driving environments. Then the contextual information and future occupancy maps are input to the policy network of the GP3Net framework and trained using Proximal Policy Optimization (PPO) algorithm. The proposed GP3Net performance is evaluated on standard CARLA benchmarking scenarios with domain shifts of traffic patterns (urban, highway, and mixed). The results show that the GP3Net outperforms previous state-of-the-art imitation learning-based planning models for different towns. Further, in unseen new weather conditions, GP3Net completes the desired route with fewer traffic infractions. Finally, the results emphasize the advantage of including the prediction module to enhance safety measures in non-stationary environments.
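
For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched as follows. This is the standard PPO-Clip policy loss, not code from the paper: the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions, and the value-function and entropy terms that a full training loop normally adds are omitted.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over a batch.

    new_logp:   log-probabilities of the taken actions under the current policy.
    old_logp:   log-probabilities under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    clip_eps:   clipping range (0.2 is a common default, assumed here).
    """
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the current and the data-collecting policy.
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective per sample, then negate
    # so that minimising the loss maximises the surrogate objective.
    return -np.mean(np.minimum(unclipped, clipped))
```

The clipping removes the incentive to push the probability ratio outside `[1 - clip_eps, 1 + clip_eps]` when doing so would increase the objective, which keeps each policy update close to the policy that gathered the driving data.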

arXiv information

Authors	Jayabrata Chowdhury, Venkataramanan Shivaraman, Suresh Sundaram, P B Sujit
Published	2023-12-10 06:04:45+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
