Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning

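Abstract

A runtime assurance system (RTA) for a given plant makes it possible to run an untrusted or experimental controller while a backup (or safety) controller ensures safety.
The associated computational design problem is to create logic that ensures safety by switching to the safety controller when needed, while maximizing a performance criterion such as utilization of the untrusted controller.
Existing RTA design strategies are well known to be overly conservative and, in principle, can lead to safety violations.
This paper formulates the optimal RTA design problem and presents a new approach for solving it.
The approach relies on reward shaping and reinforcement learning.
It can guarantee safety and uses machine learning technologies to achieve scalability.
The authors implemented the algorithm and present experimental results comparing it with state-of-the-art reachability-based and simulation-based RTA approaches in a number of scenarios that use aircraft models in 3D space with complex safety requirements.
The approach can guarantee safety while achieving higher utilization of the experimental controller than existing approaches.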

Abstract (original)

A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup (or safety) controller. The relevant computational design problem is to create a logic that assures safety by switching to the safety controller as needed, while maximizing some performance criteria, such as the utilization of the untrusted controller. Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present a new approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee safety and leverage machine learning technologies for scalability. We have implemented this algorithm and present experimental results comparing our approach with state-of-the-art reachability and simulation-based RTA approaches in a number of scenarios using aircraft models in 3D space with complex safety requirements. Our approach can guarantee safety while increasing utilization of the experimental controller over existing approaches.
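The switching logic described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, which relies on reward shaping and reinforcement learning; it is a simple forward-simulation switch of the kind the abstract lists as a baseline. The plant (a 1D double integrator approaching a wall), both controllers, and all constants and names (`DT`, `WALL`, `A_MAX`, `recoverable`, `rta_run`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# All constants below are hypothetical values for a toy example.
DT = 0.1      # time step
WALL = 10.0   # unsafe set: position >= WALL
A_MAX = 1.0   # acceleration bound


@dataclass(frozen=True)
class State:
    x: float  # position
    v: float  # velocity


def step(s: State, a: float) -> State:
    """Advance the toy double integrator by one time step."""
    a = max(-A_MAX, min(A_MAX, a))
    v = s.v + a * DT
    return State(s.x + v * DT, v)


def untrusted(s: State) -> float:
    """Stand-in experimental controller: always accelerates toward the wall."""
    return A_MAX


def safety(s: State) -> float:
    """Stand-in safety controller: brake to a stop without reversing."""
    return max(-A_MAX, -s.v / DT) if s.v > 0 else 0.0


def is_safe(s: State) -> bool:
    return s.x < WALL


def recoverable(s: State, horizon: int = 1000) -> bool:
    """Simulate the safety controller from s; True if it stops while staying safe."""
    for _ in range(horizon):
        if not is_safe(s):
            return False
        if s.v <= 0:
            return True
        s = step(s, safety(s))
    return False


def rta_run(s: State, steps: int = 300):
    """Use the untrusted controller whenever the resulting state is recoverable."""
    used = 0
    for _ in range(steps):
        candidate = step(s, untrusted(s))
        if recoverable(candidate):
            s = candidate
            used += 1
        else:
            s = step(s, safety(s))
    return s, used / steps  # final state and utilization of the untrusted controller


if __name__ == "__main__":
    final, utilization = rta_run(State(0.0, 0.0))
    print(is_safe(final), utilization)
```

In this sketch, safety follows by induction as long as the initial state is recoverable: the loop only ever moves to a state from which the safety controller can still stop before the wall. The returned utilization is the fraction of steps driven by the untrusted controller, which is the kind of performance criterion the abstract says an optimal RTA should maximize.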

arXiv information

Authors	Kristina Miller, Christopher K. Zeitler, William Shen, Kerianne Hobbs, Sayan Mitra, John Schierman, Mahesh Viswanathan
Published	2023-10-06 14:45:57+00:00
arXiv page	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
