🤖 AI Summary
This study addresses a limitation of conventional symmetric error metrics in power load forecasting: they fail to capture operational safety risk, particularly under-prediction, which can lead to supply shortfalls. To bridge this gap, the authors propose a safety-oriented evaluation framework built on asymmetric metrics -- the Under-Prediction Rate (UPR), the 99.5th-percentile tail reserve requirement (Reserve$_{99.5}^{\%}$), and explicit inflation diagnostics (Bias$_{24h}$/OPR) -- exposing the misalignment between standard accuracy measures and actual grid risk. They further introduce S-Mamba, a weather-aware state-space model designed to improve prediction reliability under extreme conditions. Experiments show that S-Mamba attains a 99.5th-percentile tail-risk reserve margin of only 14.12%, versus iTransformer's 16.66%, supporting its stronger capability to safeguard grid security.
📝 Abstract
Accurate grid load forecasting is safety-critical: under-predictions risk supply shortfalls, while symmetric error metrics can mask this operational asymmetry. We introduce an operator-legible evaluation framework -- Under-Prediction Rate (UPR), tail Reserve$_{99.5}^{\%}$ requirements, and explicit inflation diagnostics (Bias$_{24h}$/OPR) -- to quantify one-sided reliability risk beyond MAPE. Using this framework, we evaluate state-space models (Mamba variants) and strong baselines on a weather-aligned California Independent System Operator (CAISO) dataset spanning Nov 2023--Nov 2025 (84,498 hourly records across 5 regional transmission areas) under a rolling-origin walk-forward backtest. We develop and evaluate thermal-lag-aligned weather fusion strategies for these architectures. Our results demonstrate that standard accuracy metrics are insufficient proxies for operational safety: models with comparable MAPE can imply materially different tail reserve requirements (Reserve$_{99.5}^{\%}$). We show that explicit weather integration narrows error distributions, reducing the impact of temperature-driven demand spikes. Furthermore, while probabilistic calibration reduces large-error events, it can induce systematic schedule inflation. We introduce Bias/OPR-constrained objectives to enable auditable trade-offs between minimizing tail risk and preventing trivial over-forecasting.
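To make the one-sided metrics concrete, here is a minimal sketch of how UPR and the tail reserve requirement could be computed. The exact definitions below are assumptions inferred from the abstract's descriptions, not the paper's published formulas:

```python
import numpy as np

def under_prediction_rate(y_true, y_pred):
    """UPR: fraction of hours where the forecast falls below actual load.
    Assumed definition -- the paper may weight or window this differently."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(y_pred < y_true))

def reserve_tail_pct(y_true, y_pred, q=99.5):
    """Assumed Reserve_q: q-th percentile of relative under-prediction
    (as % of actual load) -- the reserve margin needed to cover all but
    the worst (100 - q)% of shortfall hours."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    shortfall = np.maximum(y_true - y_pred, 0.0) / y_true * 100.0
    return float(np.percentile(shortfall, q))

# Toy example (hypothetical loads in MW), under-predicting half the time:
actual   = np.array([100.0, 110.0, 120.0, 130.0])
forecast = np.array([105.0, 100.0, 125.0, 117.0])
print(under_prediction_rate(actual, forecast))  # 0.5
print(reserve_tail_pct(actual, forecast))
```

Two models with the same symmetric MAPE can differ sharply on these quantities, since only the under-prediction side enters the shortfall distribution; this is the misalignment between accuracy and operational safety the framework is meant to expose.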