🤖 AI Summary
Photovoltaic (PV) operators face uncertainty in both generation output and electricity prices, so they need intraday bidding strategies that maximize revenue while minimizing imbalance penalties. This paper proposes an interpretable, data-driven reinforcement learning framework. The problem is formulated as a Markov decision process grounded in market microstructure and historical price dynamics. A predominantly linear policy is used to improve decision transparency and data efficiency. Combined with domain-informed feature engineering and the proximal policy optimization (PPO) algorithm, the framework converges rapidly and supports real-time inference in sequential decision-making settings. Experimental evaluations across multiple realistic scenarios show that the approach consistently outperforms benchmark strategies, achieving higher revenue, significantly lower imbalance costs, and more stable behavior in deployment. These results support its practical applicability and robustness under stochastic generation and price conditions.
📝 Abstract
Photovoltaic (PV) operators face substantial uncertainty in generation and short-term electricity prices. Continuous intraday markets enable producers to adjust their positions in real time, potentially improving revenues and reducing imbalance costs. We propose a feature-driven reinforcement learning (RL) approach for PV intraday trading that integrates data-driven features into the state and learns bidding policies in a sequential decision framework. The problem is cast as a Markov Decision Process with a reward that balances trading profit against imbalance penalties, and it is solved with Proximal Policy Optimization (PPO) using a predominantly linear, interpretable policy. Trained on historical market data and evaluated out-of-sample, the strategy consistently outperforms benchmark strategies across diverse scenarios. Extensive validation shows rapid convergence, real-time inference, and transparent decision rules. The learned weights highlight the central role of market microstructure and historical features. Taken together, these results indicate that feature-driven RL offers a practical, data-efficient, and operationally deployable pathway for active intraday participation by PV producers.
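To make the reward and policy structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the exact reward, feature set, or policy parameterization, so every name and formula here (`step_reward`, `linear_policy_mean`, the absolute-deviation penalty, the sign convention for trades) is an illustrative assumption.

```python
import numpy as np


def step_reward(trade_mwh, price_eur_per_mwh, position_mwh,
                actual_gen_mwh, penalty_eur_per_mwh):
    """Hypothetical per-step reward: trading profit minus imbalance penalty.

    Assumed convention: positive trade_mwh means energy sold on the
    intraday market. The imbalance is the absolute gap between the
    committed position and the realized generation.
    """
    trading_profit = trade_mwh * price_eur_per_mwh
    imbalance_mwh = abs(actual_gen_mwh - position_mwh)
    return trading_profit - penalty_eur_per_mwh * imbalance_mwh


def linear_policy_mean(weights, bias, features):
    """Hypothetical linear policy: the action mean is a weighted sum of features.

    With a linear map, each weight can be read directly as the
    sensitivity of the bid to one feature, which is the sense in which
    such a policy is interpretable.
    """
    return float(np.dot(weights, features) + bias)


# Example with made-up numbers: sell 2 MWh at 50 EUR/MWh while holding a
# 10 MWh position against 9 MWh of realized generation.
reward = step_reward(2.0, 50.0, 10.0, 9.0, 80.0)

# Example with two made-up features, e.g. a price spread and a forecast error.
action = linear_policy_mean(np.array([0.5, -1.0]), 0.1, np.array([2.0, 0.5]))
```

In a PPO setup, a function like `linear_policy_mean` would typically supply the mean of a stochastic action distribution, and the weights would be updated from rewards like the one above; how the paper combines these pieces is not stated in the abstract.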