🤖 AI Summary
To address online probabilistic forecasting for large-scale streaming data, this paper proposes an incremental learning method that integrates online LASSO with the Generalized Additive Models for Location, Scale, and Shape (GAMLSS) framework—enabling, for the first time, real-time updating of regularized conditional distribution models capturing heteroscedasticity and higher-order moments. The method leverages online gradient optimization coupled with sparse regularization, achieving both statistical interpretability and substantial computational efficiency gains. Evaluated on day-ahead electricity price forecasting, it achieves state-of-the-art probabilistic forecast accuracy while reducing training time by over 80%, supporting millisecond-level dynamic calibration and industrial-grade real-time deployment. Key contributions include: (1) the first scalable online GAMLSS framework; (2) joint sparse estimation and progressive updating of distribution parameters; and (3) an open-source, high-performance Python implementation balancing modeling flexibility with engineering practicality.
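The online LASSO update at the heart of this combination can be sketched as a proximal gradient step: one gradient step on the loss for the newest observation, followed by soft-thresholding to enforce sparsity. The sketch below uses a squared-error loss; the function names and hyperparameters are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty (componentwise LASSO shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def online_lasso_step(beta, x, y, lr=0.02, lam=0.01):
    """One proximal-SGD update of a linear model on a single observation."""
    grad = (x @ beta - y) * x                     # gradient of 0.5*(x@beta - y)**2
    return soft_threshold(beta - lr * grad, lr * lam)
```

Because each update touches only one observation, the cost per new data point is constant, which is what makes millisecond-level recalibration on a stream feasible in principle.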
📝 Abstract
Large-scale streaming data are common in modern machine learning applications and have driven the development of online learning algorithms. Many fields, such as supply chain management, weather and meteorology, energy markets, and finance, have pivoted towards probabilistic forecasts, which creates the need to learn not only the expected value accurately but also the conditional heteroskedasticity and higher moments of the conditional distribution. Against this backdrop, we present a methodology for online estimation of regularized, linear distributional models. The proposed algorithm combines recent developments in the online estimation of LASSO models with the well-known GAMLSS framework. We provide a case study on day-ahead electricity price forecasting, in which we show the competitive performance of the incremental estimation combined with greatly reduced computational effort. Our algorithms are implemented in a computationally efficient Python package.
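As a rough illustration of a regularized, linear distributional model updated online, the sketch below streams observations through a Gaussian location-scale model (mean and log-scale each linear in the features, in the spirit of GAMLSS) and applies one L1-regularized proximal-SGD step per observation. All names, the gradient clipping, and the hyperparameters are assumptions made for this toy example; it is not the paper's actual algorithm or the package's API:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty (componentwise LASSO shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

class OnlineGaussianLasso:
    """Toy online distributional model: y ~ N(x @ beta, exp(x @ gamma)**2).

    Each observation triggers one proximal-SGD step on the per-observation
    Gaussian negative log-likelihood, so both the conditional mean and the
    conditional scale adapt to the stream.
    """

    def __init__(self, n_features, lr=0.02, lam=0.001, clip=5.0):
        self.beta = np.zeros(n_features)    # location (mean) coefficients
        self.gamma = np.zeros(n_features)   # log-scale coefficients
        self.lr, self.lam, self.clip = lr, lam, clip

    def _clipped(self, g):
        # clip the per-observation gradient norm for numerical stability
        n = np.linalg.norm(g)
        return g * (self.clip / n) if n > self.clip else g

    def partial_fit(self, x, y):
        mu = x @ self.beta
        sigma = np.exp(x @ self.gamma)
        resid = y - mu
        # gradients of the Gaussian negative log-likelihood for (x, y)
        g_beta = -(resid / sigma**2) * x
        g_gamma = (1.0 - resid**2 / sigma**2) * x
        # proximal SGD: gradient step followed by L1 soft-thresholding
        self.beta = soft_threshold(self.beta - self.lr * self._clipped(g_beta),
                                   self.lr * self.lam)
        self.gamma = soft_threshold(self.gamma - self.lr * self._clipped(g_gamma),
                                    self.lr * self.lam)

    def predict_params(self, x):
        """Return the predicted conditional mean and standard deviation."""
        return x @ self.beta, np.exp(x @ self.gamma)
```

Modeling the scale through its logarithm keeps the predicted standard deviation positive without constraints, and the shared soft-thresholding step yields sparse coefficient vectors for both distribution parameters, mirroring the joint sparse estimation described above.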