Safe DreamerV3: Safe Reinforcement Learning with World Models

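Summary

Reinforcement learning (RL) has yet to see widespread use in real-world settings, largely because it does not meet the safety requirements inherent in such systems.
Existing safe reinforcement learning (SafeRL) methods, which use cost functions to improve safety, fail to reach zero cost in complex scenarios, including vision-only tasks, even with extensive data sampling and training.
To address this, we introduce Safe DreamerV3, a new algorithm that integrates both Lagrangian-based and planning-based methods within a world model.
Our method represents a significant advance in SafeRL: it is the first algorithm to achieve nearly zero cost on both low-dimensional and vision-only tasks in the Safety-Gymnasium benchmark.
Our project website is at https://sites.google.com/view/safedreamerv3.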

Summary (original)

The widespread application of Reinforcement Learning (RL) in real-world situations is yet to come to fruition, largely as a result of its failure to satisfy the essential safety demands of such systems. Existing safe reinforcement learning (SafeRL) methods, employing cost functions to enhance safety, fail to achieve zero-cost in complex scenarios, including vision-only tasks, even with comprehensive data sampling and training. To address this, we introduce Safe DreamerV3, a novel algorithm that integrates both Lagrangian-based and planning-based methods within a world model. Our methodology represents a significant advancement in SafeRL as the first algorithm to achieve nearly zero-cost in both low-dimensional and vision-only tasks within the Safety-Gymnasium benchmark. Our project website can be found in: https://sites.google.com/view/safedreamerv3.
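The abstract refers to Lagrangian-based methods without spelling them out. As a rough illustration only, the sketch below shows the textbook Lagrangian relaxation used in constrained RL in general: a multiplier grows while the measured cost exceeds a limit and is clipped at zero otherwise. It is not taken from the Safe DreamerV3 paper or its code; the function names, the learning rate, and the cost limit are all hypothetical.

```python
# Generic Lagrangian relaxation for a cost-constrained objective.
# Assumption: this is the standard textbook form, NOT the paper's implementation.
# All names and default values here are hypothetical.


def update_lagrange_multiplier(lam, episode_cost, cost_limit, lr=0.5):
    """One projected gradient-ascent step on the multiplier.

    The multiplier increases when the observed cost exceeds the limit,
    decreases when the cost is below it, and never drops below zero.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


def penalized_objective(reward_return, cost_return, lam, cost_limit):
    """Reward return minus the multiplier-weighted constraint violation."""
    return reward_return - lam * (cost_return - cost_limit)


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical per-episode costs, with a cost limit of zero.
    for cost in [10.0, 6.0, 2.0, 0.0]:
        lam = update_lagrange_multiplier(lam, cost, cost_limit=0.0)
        print(f"cost={cost:.1f} multiplier={lam:.1f}")
```

With a cost limit of zero, any nonzero cost keeps pushing the multiplier up, so the penalty on unsafe behavior keeps growing until the policy stops incurring cost.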

arXiv information

Authors	Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, Yaodong Yang
Published	2023-07-14 06:00:08+00:00
arXiv page	arxiv_id(pdf)

Source, services used

arxiv.jp, Google
