Domain-adapted Learning and Imitation: DRL for Power Arbitrage

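Summary

This paper studies the Dutch power market, which consists of a day-ahead market and an intraday balancing market that operates like an auction.
Fluctuations in power supply and demand often create imbalances, so prices in the two markets diverge and open up arbitrage opportunities.
To exploit this, the authors restructure the problem and propose a collaborative dual-agent reinforcement learning approach for bi-level simulation and optimization of European power arbitrage trading.
They also introduce two new implementations that incorporate domain-specific knowledge by imitating the trading behaviour of power traders.
Reward engineering that imitates domain expertise reshapes the RL agent's reward system, which improves convergence during training and overall performance.
In addition, tranching orders raises the bidding success rate and significantly increases profit and loss (P&L).
The study shows that using domain expertise in a general learning problem improves performance substantially: the final integrated approach achieves a three-fold improvement in cumulative P&L over the original agent.
The method also outperforms the best benchmark policy by around 50% while remaining computationally efficient.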

Abstract (original)

In this paper, we discuss the Dutch power market, which is comprised of a day-ahead market and an intraday balancing market that operates like an auction. Due to fluctuations in power supply and demand, there is often an imbalance that leads to different prices in the two markets, providing an opportunity for arbitrage. To address this issue, we restructure the problem and propose a collaborative dual-agent reinforcement learning approach for this bi-level simulation and optimization of European power arbitrage trading. We also introduce two new implementations designed to incorporate domain-specific knowledge by imitating the trading behaviours of power traders. By utilizing reward engineering to imitate domain expertise, we are able to reform the reward system for the RL agent, which improves convergence during training and enhances overall performance. Additionally, the tranching of orders increases bidding success rates and significantly boosts profit and loss (P&L). Our study demonstrates that by leveraging domain expertise in a general learning problem, the performance can be improved substantially, and the final integrated approach leads to a three-fold improvement in cumulative P&L compared to the original agent. Furthermore, our methodology outperforms the highest benchmark policy by around 50% while maintaining efficient computational performance.
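The abstract does not spell out how order tranching raises the bidding success rate, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes a buy order clears when its limit price is at or above the day-ahead clearing price, that cleared volume is later settled at the imbalance price, and that tranches have equal volume and evenly spaced limit prices. All names (`Tranche`, `make_tranches`, `cleared_volume`, `arbitrage_pnl`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tranche:
    price: float   # limit price in EUR/MWh (hypothetical value)
    volume: float  # volume in MWh (hypothetical value)


def make_tranches(total_volume, base_price, spread, n):
    """Split one buy order into n equal-volume tranches whose limit prices
    are evenly spaced over [base_price - spread, base_price + spread]."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [Tranche(base_price, total_volume)]
    step = 2 * spread / (n - 1)
    return [
        Tranche(base_price - spread + i * step, total_volume / n)
        for i in range(n)
    ]


def cleared_volume(tranches, clearing_price):
    """Assumed clearing rule: a buy tranche clears if its limit price
    is at or above the clearing price."""
    return sum(t.volume for t in tranches if t.price >= clearing_price)


def arbitrage_pnl(tranches, day_ahead_price, imbalance_price):
    """P&L of buying the cleared volume day-ahead and settling it
    at the imbalance price (fees and penalties ignored)."""
    volume = cleared_volume(tranches, day_ahead_price)
    return volume * (imbalance_price - day_ahead_price)


# One 10 MWh order at 50 EUR/MWh versus the same volume in five tranches.
single = make_tranches(10.0, base_price=50.0, spread=4.0, n=1)
tranched = make_tranches(10.0, base_price=50.0, spread=4.0, n=5)

single_pnl = arbitrage_pnl(single, day_ahead_price=51.0, imbalance_price=60.0)
tranched_pnl = arbitrage_pnl(tranched, day_ahead_price=51.0, imbalance_price=60.0)
print(single_pnl, tranched_pnl)
```

In this toy case the single order misses the clearing price entirely, while the tranches priced at 52 and 54 EUR/MWh still clear, so part of the volume captures the spread. A real market would also need the opposite case (imbalance price below the day-ahead price), where cleared tranches lose money.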

arXiv information

Authors	Yuanrong Wang, Vignesh Raja Swaminathan, Nikita P. Granger, Carlos Ros Perez, Christian Michler
Published	2023-08-02 16:49:37+00:00
arXiv site	arxiv_id(pdf)

Source and services used

arxiv.jp, Google
