Multi-Objective Deep Reinforcement Learning for Optimisation in Autonomous Systems

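Summary

Reinforcement Learning (RL) is used extensively in Autonomous Systems (AS) because it enables learning at runtime without requiring a model of the environment or predefined actions.
However, most applications of RL in AS, such as those based on Q-learning, can optimize only one objective, so multi-objective systems must combine their objectives into a single objective function with predefined weights.
A number of Multi-Objective Reinforcement Learning (MORL) techniques exist, but they have mostly been applied to RL benchmarks rather than to real-world AS.
This work uses a MORL technique called Deep W-Learning (DWN) and applies it to the Emergent Web Servers exemplar, a self-adaptive server, to find the optimal configuration for runtime performance optimization.
DWN is compared with two single-objective optimization implementations: the ε-greedy algorithm and Deep Q-Networks (DQN).
The initial evaluation shows that DWN optimizes multiple objectives simultaneously, achieves results similar to those of the DQN and ε-greedy approaches with better performance on some metrics, and avoids the issues associated with combining multiple objectives into a single utility function.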
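
The difference between collapsing objectives with fixed weights and arbitrating between them can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain lists stand in for the neural networks, all function names and numbers are hypothetical, and the arbitration rule only follows the general W-learning idea (each objective nominates its own greedy action, and the objective with the highest W-value wins) rather than any detail taken from the paper.

```python
import random


def greedy(q_values):
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore a random action, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def scalarised_action(q_per_objective, weights):
    """Single-objective route: merge objectives with predefined weights,
    then act greedily on the combined values."""
    n_actions = len(q_per_objective[0])
    combined = [
        sum(w * q[a] for w, q in zip(weights, q_per_objective))
        for a in range(n_actions)
    ]
    return greedy(combined)


def w_arbitrated_action(q_per_objective, w_values):
    """W-learning-style route (assumed form): the objective with the largest
    W-value for the current state gets to execute its own greedy action."""
    winner = max(range(len(w_values)), key=lambda i: w_values[i])
    return greedy(q_per_objective[winner])


# Hypothetical Q-values for one state: two objectives, three configurations.
Q_PER_OBJECTIVE = [
    [1.0, 0.2, 0.0],  # objective 0 prefers configuration 0
    [0.0, 0.3, 0.9],  # objective 1 prefers configuration 2
]
```

With these made-up numbers, `scalarised_action(Q_PER_OBJECTIVE, [0.5, 0.5])` selects configuration 0, while shifting the weights to `[0.3, 0.7]` selects configuration 2, so the outcome depends on weights chosen in advance. `w_arbitrated_action` needs no such weights: the choice follows whichever objective currently holds the larger W-value.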

Summary (original)

Reinforcement Learning (RL) is used extensively in Autonomous Systems (AS) as it enables learning at runtime without the need for a model of the environment or predefined actions. However, most applications of RL in AS, such as those based on Q-learning, can only optimize one objective, making it necessary in multi-objective systems to combine multiple objectives in a single objective function with predefined weights. A number of Multi-Objective Reinforcement Learning (MORL) techniques exist but they have mostly been applied in RL benchmarks rather than real-world AS systems. In this work, we use a MORL technique called Deep W-Learning (DWN) and apply it to the Emergent Web Servers exemplar, a self-adaptive server, to find the optimal configuration for runtime performance optimization. We compare DWN to two single-objective optimization implementations: {\epsilon}-greedy algorithm and Deep Q-Networks. Our initial evaluation shows that DWN optimizes multiple objectives simultaneously with similar results than DQN and {\epsilon}-greedy approaches, having a better performance for some metrics, and avoids issues associated with combining multiple objectives into a single utility function.

arXiv information

Authors	Juan C. Rosero, Ivana Dusparic, Nicolás Cardozo
Published	2024-09-30 13:15:14+00:00
arXiv site	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
