Distributionally Robust Policy and Lyapunov-Certificate Learning

Summary

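This article presents a new method for synthesizing distributionally robust, stabilizing neural controllers and certificates for control systems under model uncertainty.
A key challenge in designing controllers with stability guarantees for uncertain systems is accurately determining, and adapting to, shifts in the model's parametric uncertainty during online deployment.
We address this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint, which ensures a monotonic decrease of the Lyapunov certificate.
To avoid the computational complexity of working in the space of probability measures, we identify a sufficient condition, in the form of deterministic convex constraints, that ensures the Lyapunov derivative constraint is satisfied.
We integrate this condition into a loss function for training a neural-network-based controller and show that, for the resulting closed-loop system, global asymptotic stability of the equilibrium can be certified with high confidence, even under out-of-distribution (OoD) model uncertainties.
To demonstrate the efficacy and efficiency of the proposed methodology, we compare it in simulation, on two control problems, with an uncertainty-agnostic baseline and several reinforcement learning approaches.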

Abstract (original)

This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate. To avoid the computational complexity involved in dealing with the space of probability measures, we identify a sufficient condition in the form of deterministic convex constraints that ensures the Lyapunov derivative constraint is satisfied. We integrate this condition into a loss function for training a neural network-based controller and show that, for the resulting closed-loop system, the global asymptotic stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution (OoD) model uncertainties. To demonstrate the efficacy and efficiency of the proposed methodology, we compare it with an uncertainty-agnostic baseline approach and several reinforcement learning approaches in two control problems in simulation.
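
The abstract does not spell out the deterministic sufficient condition or the loss, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a stand-in technique that is common in the distributionally robust optimization literature: the chance constraint on the Lyapunov derivative is replaced by an empirical CVaR bound plus a Wasserstein-radius penalty, and violations are penalized with a hinge loss. The scalar system `xdot = xi * x + u`, the linear controller `u = -gain * x`, the certificate `V(x) = x^2`, and all function and parameter names are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def empirical_cvar(values, eps):
    """CVaR at level 1 - eps of an empirical sample (Rockafellar-Uryasev form).

    Computes min_t { t + mean(max(v - t, 0)) / eps }. The objective is convex and
    piecewise linear in t with kinks at the samples, so checking the samples suffices.
    """
    n = len(values)
    return min(t + sum(max(v - t, 0.0) for v in values) / (eps * n) for t in values)


def robust_violation(x, gain, xi_samples, eps, radius, alpha):
    """Upper bound on the worst-case CVaR of the Lyapunov decrease condition.

    Assumed toy setup (not from the article): xdot = xi * x + u, u = -gain * x,
    V(x) = x^2, so Vdot + alpha * V = (2 * (xi - gain) + alpha) * x^2.
    """
    g = [(2.0 * (xi - gain) + alpha) * x * x for xi in xi_samples]
    lipschitz = 2.0 * x * x  # Lipschitz constant of g with respect to xi
    # Wasserstein penalty plus empirical CVaR; a value <= 0 implies that the
    # decrease condition holds with probability >= 1 - eps over the ambiguity ball.
    return radius * lipschitz / eps + empirical_cvar(g, eps)


def lyapunov_loss(states, gain, xi_samples, eps=0.5, radius=0.05, alpha=0.5):
    """Hinge loss: average positive part of the robust violation over sampled states."""
    total = sum(max(0.0, robust_violation(x, gain, xi_samples, eps, radius, alpha))
                for x in states)
    return total / len(states)


if __name__ == "__main__":
    states = [-1.0, -0.5, 0.5, 1.0]
    xi_samples = [0.9, 1.0, 1.1]  # offline samples of the uncertain parameter
    for gain in (0.5, 3.0):
        print(gain, lyapunov_loss(states, gain, xi_samples))
```

In this toy setting a gain that dominates the sampled uncertainty drives the loss to zero, while a weak gain leaves a positive penalty. In the setting the abstract describes, the controller would be a neural network and such a loss would be minimized by gradient-based training rather than evaluated for fixed gains.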

arXiv information

Authors	Kehan Long, Jorge Cortes, Nikolay Atanasov
Published	2024-08-03 18:43:14+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
