Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous

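Summary

This study introduces a new application of the masked Proximal Policy Optimization (PPO) algorithm from deep reinforcement learning (RL) to determine the most efficient order in which to visit space debris, using Izzo's adaptation of the Lambert solver for each individual rendezvous.
The goal is to optimize the order in which all of the specified debris are visited so that the total rendezvous time for the whole mission is minimized.
A neural network (NN) policy is developed and trained on simulated space missions with varying debris fields.
After training, the neural network computes near-optimal paths using Izzo's adaptation of Lambert maneuvers.
Performance is evaluated against standard heuristics used in mission planning.
The reinforcement learning approach substantially improves planning efficiency by optimizing the debris rendezvous sequence, reducing total mission time by about 10.96% and 13.66% on average compared with the Genetic and Greedy algorithms, respectively.
On average, the model identifies the most time-efficient debris visitation sequence across the simulated scenarios, and it does so with the fastest computation time.
This approach marks a step forward in strengthening mission planning strategies for space debris removal.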

Abstract (original)

This research introduces a novel application of a masked Proximal Policy Optimization (PPO) algorithm from the field of deep reinforcement learning (RL), for determining the most efficient sequence of space debris visitation, utilizing the Lambert solver as per Izzo’s adaptation for individual rendezvous. The aim is to optimize the sequence in which all the given debris should be visited to get the least total time for rendezvous for the entire mission. A neural network (NN) policy is developed, trained on simulated space missions with varying debris fields. After training, the neural network calculates approximately optimal paths using Izzo’s adaptation of Lambert maneuvers. Performance is evaluated against standard heuristics in mission planning. The reinforcement learning approach demonstrates a significant improvement in planning efficiency by optimizing the sequence for debris rendezvous, reducing the total mission time by an average of approximately 10.96% and 13.66% compared to the Genetic and Greedy algorithms, respectively. The model on average identifies the most time-efficient sequence for debris visitation across various simulated scenarios with the fastest computational speed. This approach signifies a step forward in enhancing mission planning strategies for space debris clearance.
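The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: action masking, which keeps a policy from selecting debris that have already been visited, and a greedy baseline of the kind the approach is compared against. The function names, the logits, and the transfer-time matrix are all hypothetical. The sketch also assumes fixed pairwise transfer times, whereas transfers computed with a Lambert solver depend on the departure epoch.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over logits that assigns zero probability where mask is False."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_sequence(transfer_time, start=0):
    """Greedy baseline: always fly to the unvisited debris with the shortest transfer.

    transfer_time[i][j] is an assumed, fixed transfer duration from debris i to j.
    """
    n = len(transfer_time)
    visited = [False] * n
    visited[start] = True
    order, total, current = [start], 0.0, start
    for _ in range(n - 1):
        candidates = [j for j in range(n) if not visited[j]]
        nxt = min(candidates, key=lambda j: transfer_time[current][j])
        total += transfer_time[current][nxt]
        visited[nxt] = True
        order.append(nxt)
        current = nxt
    return order, total


# Hypothetical policy logits for three debris; debris 1 has already been visited.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])

# Hypothetical transfer times (arbitrary units) between four debris.
T = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
order, total = greedy_sequence(T, start=0)
```

In a masked policy the mask is rebuilt after every rendezvous, so the probability mass is always spread only over debris that remain to be visited.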

arXiv information

Authors	Agni Bandyopadhyay, Guenther Waxenegger-Wilfing
Published	2024-09-25 12:50:01+00:00

Source and services used

arxiv.jp, Google
