🤖 AI Summary
To address the low prediction accuracy and poor generalizability of residential cooling load forecasts under limited data in temperate cities such as London, where global warming is driving up cooling demand, this paper proposes a physics-informed, lightweight GRU framework. The method trains on synthetic data generated by a physics-based model and introduces a day-level interpolation strategy for data partitioning that preserves both temporal continuity and sample randomness. Bayesian optimization is used for automated hyperparameter tuning. With scarce real-world measurements, the optimized day-interpolated GRU model achieves RMSE = 2.22%, MAE = 0.87%, and R² = 0.9386 on the test set. It is reported to outperform existing approaches in both prediction accuracy and cross-regional generalizability, addressing key precision and robustness bottlenecks in regional-scale time-series forecasting.
📝 Abstract
In the context of global warming, even relatively cool countries like the UK are experiencing a rise in cooling demand, particularly in southern regions such as London. This growing demand, especially during the summer months, presents significant challenges for energy management systems. Accurately predicting cooling demand in urban domestic buildings is essential for maintaining energy efficiency. This study introduces a generalised framework for developing high-resolution Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks using physical model-based summer cooling demand data. To maximise the predictive capability and generalisation ability of the models under limited data scenarios, four distinct data partitioning strategies were implemented: extrapolation, month-based interpolation, global interpolation, and day-based interpolation. Bayesian Optimisation (BO) was then applied to fine-tune the hyper-parameters, substantially improving the framework's predictive accuracy. Results show that the day-based interpolation GRU model performed best because it retains both the randomness of the data and the continuity of the time sequence. This optimal model achieved a Root Mean Squared Error (RMSE) of 2.22%, a Mean Absolute Error (MAE) of 0.87%, and a coefficient of determination (R²) of 0.9386 on the test set. The generalisation ability of this framework was further evaluated by forecasting.
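The abstract does not spell out how the partitioning strategies are implemented, so the following is only a minimal sketch of one plausible reading. It assumes that day-based interpolation means holding out randomly chosen whole calendar days (so each day's hourly sequence stays intact), and that extrapolation means holding out the final chronological block. The function names `day_based_split` and `extrapolation_split`, the 20% test fraction, and the toy hourly timestamps are all hypothetical and not taken from the paper.

```python
import random
from datetime import datetime, timedelta


def day_based_split(timestamps, test_fraction=0.2, seed=0):
    """Assign whole calendar days to train or test at random.

    Hypothetical helper (not the paper's code): samples within a day stay
    together and in order, while the choice of held-out days is random.
    """
    days = sorted({ts.date() for ts in timestamps})
    rng = random.Random(seed)
    n_test = max(1, round(len(days) * test_fraction))
    test_days = set(rng.sample(days, n_test))
    train_idx = [i for i, ts in enumerate(timestamps) if ts.date() not in test_days]
    test_idx = [i for i, ts in enumerate(timestamps) if ts.date() in test_days]
    return train_idx, test_idx


def extrapolation_split(timestamps, test_fraction=0.2):
    """Chronological split (assumed meaning): hold out the last samples."""
    n_test = round(len(timestamps) * test_fraction)
    cut = len(timestamps) - n_test
    return list(range(cut)), list(range(cut, len(timestamps)))


# Toy input: 10 days of hourly timestamps (synthetic, for illustration only).
start = datetime(2023, 6, 1)
timestamps = [start + timedelta(hours=h) for h in range(10 * 24)]

train_idx, test_idx = day_based_split(timestamps, test_fraction=0.2, seed=0)
held_out_days = sorted({timestamps[i].date() for i in test_idx})
print(len(train_idx), len(test_idx), len(held_out_days))
```

Under these assumptions, the held-out days are scattered through the summer rather than clustered at the end, which is one way a split could keep within-day sequence continuity while still sampling test conditions at random.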