Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning

Abstract

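Intelligent decision-making is increasingly applied to unmanned aerial vehicles (UAVs), and as the one-on-one UAV pursuit-evasion game has matured, the multi-UAV cooperative game has emerged as a new challenge.
To enable UAVs to make decisions autonomously in complex game environments, this paper proposes a deep reinforcement learning-based decision-making model for the multi-role UAV cooperative pursuit-evasion game.
To improve the training efficiency of reinforcement learning in a UAV pursuit-evasion game environment with a high-dimensional state-action space, the paper proposes a multi-environment asynchronous double deep Q-network with prioritized experience replay to train the UAVs' game policy effectively.
Furthermore, to improve cooperation and task completion efficiency while minimizing the cost to the UAVs in the pursuit-evasion game, the paper focuses on role and target allocation within the multi-UAV environment.
Cooperative game decision models for varying numbers of UAVs are obtained by assigning different tasks and roles to the UAVs in different scenarios.
Simulation results show that the proposed method enables the UAVs to make decisions autonomously in pursuit-evasion game scenarios and to cooperate effectively.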
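
The abstract names two standard building blocks, the double deep Q-network and prioritized experience replay, without giving implementation details. The following is a minimal sketch of the textbook forms of both, not the authors' implementation: the function names, the hyperparameters (gamma, alpha, beta), and the use of NumPy arrays in place of neural networks are all assumptions made for illustration. The multi-environment asynchronous training and the role and target allocation described above are not shown.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it (hypothetical helper, standard formula)."""
    best_actions = np.argmax(q_online_next, axis=1)
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions (dones == 1) contribute no bootstrapped value.
    return rewards + gamma * (1.0 - dones) * next_values

def per_probabilities(priorities, alpha=0.6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = np.asarray(priorities, dtype=float) ** alpha
    return scaled / scaled.sum()

def importance_weights(probs, indices, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled by the batch maximum."""
    weights = (len(probs) * probs[indices]) ** (-beta)
    return weights / weights.max()

# Toy batch of two transitions with two actions each (made-up numbers).
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_online_next = np.array([[1.0, 2.0], [3.0, 0.0]])
q_target_next = np.array([[10.0, 20.0], [30.0, 40.0]])
targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.5)

probs = per_probabilities([1.0, 1.0, 2.0], alpha=1.0)
weights = importance_weights(probs, np.array([0, 1, 2]), beta=1.0)
```

Selecting the action with one network and evaluating it with the other is what distinguishes double DQN from plain DQN, and the importance weights compensate for the non-uniform sampling that prioritization introduces.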

Abstract (original)

The application of intelligent decision-making in unmanned aerial vehicle (UAV) is increasing, and with the development of UAV 1v1 pursuit-evasion game, multi-UAV cooperative game has emerged as a new challenge. This paper proposes a deep reinforcement learning-based model for decision-making in multi-role UAV cooperative pursuit-evasion game, to address the challenge of enabling UAV to autonomously make decisions in complex game environments. In order to enhance the training efficiency of the reinforcement learning algorithm in UAV pursuit-evasion game environment that has high-dimensional state-action space, this paper proposes multi-environment asynchronous double deep Q-network with priority experience replay algorithm to effectively train the UAV’s game policy. Furthermore, aiming to improve cooperation ability and task completion efficiency, as well as minimize the cost of UAVs in the pursuit-evasion game, this paper focuses on the allocation of roles and targets within multi-UAV environment. The cooperative game decision model with varying numbers of UAVs are obtained by assigning diverse tasks and roles to the UAVs in different scenarios. The simulation results demonstrate that the proposed method enables autonomous decision-making of the UAVs in pursuit-evasion game scenarios and exhibits significant capabilities in cooperation.

arXiv information

Authors	Yang Zhao,Zidong Nie,Kangsheng Dong,Qinghua Huang,Xuelong Li
Published	2024-11-05 10:45:30+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
