End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear MPC

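Summary

(Economic) nonlinear model predictive control ((e)NMPC) requires dynamic system models that are sufficiently accurate in all relevant state-space regions. These models must also be computationally cheap enough to ensure real-time tractability. However, such models are typically trained by system identification to maximize average prediction accuracy on simulation samples, and they perform suboptimally as part of an actual (e)NMPC. This paper presents a method for end-to-end reinforcement learning of dynamic surrogate models for optimal performance in (e)NMPC applications, yielding predictive controllers that balance control performance and computational demand. The method is validated on two applications derived from an established nonlinear continuous stirred-tank reactor model. Its performance is compared with that of MPCs using models trained under the prevailing maximum-prediction-accuracy paradigm and with model-free neural network controllers trained using reinforcement learning. The results show that the method matches the performance of the model-free neural network controllers while consistently outperforming models derived from system identification. The MPC policies are also shown to react to changes in the control setting without retraining.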

Abstract (original)

(Economic) nonlinear model predictive control ((e)NMPC) requires dynamic system models that are sufficiently accurate in all relevant state-space regions. These models must also be computationally cheap enough to ensure real-time tractability. Data-driven surrogate models for mechanistic models can be used to reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum average prediction accuracy on simulation samples and perform suboptimally as part of actual (e)NMPC. We present a method for end-to-end reinforcement learning of dynamic surrogate models for optimal performance in (e)NMPC applications, resulting in predictive controllers that strike a favorable balance between control performance and computational demand. We validate our method on two applications derived from an established nonlinear continuous stirred-tank reactor model. We compare the controller performance to that of MPCs utilizing models trained by the prevailing maximum prediction accuracy paradigm, and model-free neural network controllers trained using reinforcement learning. We show that our method matches the performance of the model-free neural network controllers while consistently outperforming models derived from system identification. Additionally, we show that the MPC policies can react to changes in the control setting without retraining.

arXiv information

Authors	Daniel Mayfrank, Alexander Mitsos, Manuel Dahmen
Published	2023-08-03 10:21:53+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, DeepL
