🤖 AI Summary
Short-term power forecasting for land-based photovoltaic (PV) plants worldwide is constrained by the limited availability of ground-measured data. To address this, the paper proposes a two-stage transfer learning framework: pretraining on large-scale synthetic irradiance and power data generated by PVGIS, followed by fine-tuning on scarce real-world measurements. The method integrates multivariate time-series modeling with a hybrid LSTM-Transformer architecture to enhance cross-regional and cross-climatic generalization. Empirical evaluation across hundreds of PV plants in the Netherlands, Australia, and Belgium demonstrates that, under small-sample conditions (<30 days of observed data), the proposed model reduces mean prediction error by 22.7% relative to baseline models. All code and pretrained models are publicly released, enabling plug-and-play deployment for any PV system with accessible simulated and observational data. This work establishes a scalable, robust paradigm for renewable energy forecasting in data-scarce settings.
📝 Abstract
Deep learning models have gained increasing prominence in recent years in the field of solar photovoltaic (PV) forecasting. One drawback of these models is that they require large amounts of high-quality data to perform well. This is often infeasible in practice, due to poor measurement infrastructure in legacy systems and the rapid build-up of new solar systems across the world. This paper proposes SolNet: a novel, general-purpose, multivariate solar power forecaster, which addresses these challenges with a two-step forecasting pipeline that first applies transfer learning from abundant synthetic data generated from PVGIS and then fine-tunes on observational data. Using actual production data from hundreds of sites in the Netherlands, Australia and Belgium, we show that SolNet improves forecasting performance over both models trained in data-scarce settings and baseline models. We find transfer learning benefits to be the strongest when only limited observational data is available. At the same time, we provide several guidelines and considerations for transfer learning practitioners, as our results show that weather data, seasonal patterns, the amount of synthetic data and possible mis-specification of the source location can have a major impact on the results. The SolNet models created in this way are applicable to any land-based solar photovoltaic system across the planet where simulated and observed data can be combined to obtain improved forecasting capabilities.
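The two-step pipeline described in the abstract (pretrain on abundant synthetic data, then fine-tune on scarce observations) can be illustrated with a minimal sketch. This is not SolNet itself: it assumes a small linear autoregressive model in place of a deep network, and a toy clipped sine curve in place of PVGIS output. All function names, parameters, and values below are hypothetical and chosen only for illustration.

```python
import math
import random


def make_series(n_hours, peak, noise, seed):
    """Toy hourly PV power: clipped daily sine plus noise (a stand-in, NOT PVGIS data)."""
    rng = random.Random(seed)
    series = []
    for t in range(n_hours):
        clear_sky = max(0.0, math.sin(2 * math.pi * ((t % 24) - 6) / 24))
        series.append(max(0.0, peak * clear_sky + rng.gauss(0.0, noise)))
    return series


def make_windows(series, n_lags):
    """Turn a series into (lagged inputs, next value) training pairs."""
    return [(series[i:i + n_lags], series[i + n_lags])
            for i in range(len(series) - n_lags)]


def predict(weights, x):
    """Linear one-step-ahead forecast: bias plus weighted lags."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def mse(weights, windows):
    """Mean squared error of the forecaster over a set of windows."""
    return sum((predict(weights, x) - y) ** 2 for x, y in windows) / len(windows)


def train(weights, windows, lr, epochs):
    """Full-batch gradient descent on MSE, starting from the given weights."""
    weights = list(weights)
    n = len(windows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in windows:
            err = predict(weights, x) - y
            grad[0] += 2 * err / n
            for j, xi in enumerate(x):
                grad[j + 1] += 2 * err * xi / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


N_LAGS = 3  # hypothetical look-back length

# Step 1: pretrain on abundant synthetic data (60 simulated days).
synthetic_windows = make_windows(make_series(24 * 60, 1.0, 0.02, seed=0), N_LAGS)
initial = [0.0] * (N_LAGS + 1)
pretrained = train(initial, synthetic_windows, lr=0.05, epochs=200)

# Step 2: fine-tune on scarce "observed" data (7 days, different peak and noise),
# starting from the pretrained weights instead of from scratch.
observed_windows = make_windows(make_series(24 * 7, 0.8, 0.05, seed=1), N_LAGS)
fine_tuned = train(pretrained, observed_windows, lr=0.05, epochs=50)

print("observed-site MSE before fine-tuning:", mse(pretrained, observed_windows))
print("observed-site MSE after fine-tuning: ", mse(fine_tuned, observed_windows))
```

The design point carried over from the abstract is only that the second call to `train` starts from `pretrained` rather than from zeros, so the small observational set adjusts a model that has already learned the daily production pattern.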