🤖 AI Summary
This study addresses the uncertainty inherent in time series model selection by studying the Model Selection Confidence Set (MSCS) approach, which identifies, at a given confidence level, the set of models statistically indistinguishable from the true data-generating process, and then extracts the most parsimonious members of that set, the Lower Boundary Models (LBM). Moving beyond the conventional single-model selection paradigm, the method covers univariate models with autoregressive and moving average components and quantifies the importance of individual model terms by how frequently each term appears in the MSCS and in the LBM. Empirical evaluation on Italian hourly electricity load data reveals substantial intraday variation in model selection uncertainty and shows that the MSCS isolates a set of competitive short-term forecasting models incorporating key predictors such as intraday hourly lags, temperature, calendar effects, and solar power generation.
📝 Abstract
This paper studies the Model Selection Confidence Set (MSCS) methodology for univariate time series models involving autoregressive and moving average components, and applies it to assess model selection uncertainty in Italian electricity load data. Rather than relying on a single model selected by an arbitrary criterion, the MSCS identifies a set of models that are statistically indistinguishable from the true data-generating process at a given confidence level. The size and composition of this set reveal crucial information about model selection uncertainty: noisy data scenarios produce larger sets with many candidate models, while more informative cases narrow the set considerably. To study the importance of each model term, we compute statistics measuring how often each term is included in both the entire MSCS and the Lower Boundary Models (LBM), the most parsimonious specifications in the set. Applied to Italian hourly electricity load data, the MSCS methodology reveals marked intraday variation in model selection uncertainty and isolates a collection of model specifications that deliver competitive short-term forecasts while highlighting key drivers of electricity load such as intraday hourly lags, temperature, calendar effects, and solar energy generation.
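The term-importance statistics described in the abstract can be illustrated with a minimal sketch. It assumes that each model is represented by the set of terms it includes, and that a lower boundary model is a member of the confidence set with no proper submodel also in the set; this is one reading of "most parsimonious specifications", not necessarily the paper's exact definition. The confidence set, the term names, and both function names are hypothetical, and the construction of the MSCS itself is not shown.

```python
from collections import Counter


def lower_boundary_models(mscs):
    """Return members of the set that have no proper submodel in the set.

    Assumption: this is how "most parsimonious" is interpreted here.
    """
    models = [frozenset(m) for m in mscs]
    return [m for m in models if not any(other < m for other in models)]


def inclusion_frequency(models):
    """Share of the given models that contain each term."""
    models = list(models)
    counts = Counter(term for m in models for term in set(m))
    return {term: counts[term] / len(models) for term in sorted(counts)}


# Hypothetical confidence set: each model is the set of terms it includes.
mscs = [
    {"lag24", "temperature"},
    {"lag24", "temperature", "holiday"},
    {"lag24", "solar"},
    {"lag24", "temperature", "solar"},
]

lbm = lower_boundary_models(mscs)
freq_mscs = inclusion_frequency(mscs)  # frequency over the whole set
freq_lbm = inclusion_frequency(lbm)    # frequency over the boundary models
print(freq_mscs)
print(freq_lbm)
```

In this toy set, a term that appears in every model of both the full set and the boundary models (here `lag24`) would be read as essential, while a term that appears only in larger models (here `holiday`) carries weaker support.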