Hedging Memory Horizons for Non-Stationary Prediction via Online Aggregation

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the trade-off between stability and adaptability in online forecasting under unknown distribution shifts. The authors propose MELO, which treats memory length as a source of uncertainty to be hedged. MELO builds adaptation experts by exponentially weighted least squares (EWLS) at several forgetting factors and combines their forecasts with the raw base forecasts using MLpol, a parameter-free online aggregation rule. It needs no preset memory horizon and no external side information. Under boundedness conditions, its oracle inequalities show that it competes with both the best raw predictor and the best bounded, time-varying affine combination of the base predictions, up to a path-length-dependent tracking cost and a sublinear aggregation overhead. On French national electricity-load forecasting through the COVID-19 lockdown, MELO reduces overall RMSE by 34.7% relative to base-only MLpol and achieves lower overall RMSE than a TabICL reference supplied with an external COVID policy-response covariate, using only lightweight per-step recursive updates and no model retraining.

📝 Abstract

We study online prediction under distribution shift, where inputs arrive chronologically and outcomes are revealed only after prediction. In this setting, predictors must remain stable in quiet regimes yet adapt when regimes shift, and the right adaptation memory is unknown in advance. We propose MELO (Memory-hedged Exponentially Weighted Least-Squares Online aggregation), a model-agnostic method that hedges across adaptation scales: it wraps any non-anticipating base-predictor pool with exponentially weighted least-squares (EWLS) adaptation experts at multiple forgetting factors, and aggregates raw and EWLS-adapted forecasts with MLpol, a parameter-free online aggregation rule. Under boundedness conditions, we establish deterministic oracle inequalities showing that it competes with both the best raw predictor and the best bounded, time-varying affine combinations of the base predictions, up to a path-length-dependent tracking cost and a sublinear aggregation overhead. We evaluate MELO on French national electricity-load forecasting through the COVID-19 lockdown using no regime indicators, lockdown dates, or policy covariates. MELO reduces overall RMSE by 34.7% relative to base-only MLpol and achieves lower overall RMSE than a TabICL reference supplied with an external COVID policy-response covariate. Moreover, MELO requires only lightweight per-step recursive updates without model retraining.
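
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `EWLSExpert`, `PolyAggregator`, `run_melo_sketch`, `lams`, and `delta` are hypothetical, the choice to regress the outcome on an intercept plus the base forecasts is an assumption, and the aggregator is a simplified MLpol-style polynomial-weights rule (per-expert learning rates, linearized squared loss) that may differ from the configuration used in the paper.

```python
import numpy as np


class EWLSExpert:
    """Recursive least squares with forgetting factor lam (hypothetical helper).

    Fits y ~ theta . [1, base forecasts]; the effective memory is
    roughly 1 / (1 - lam) time steps.
    """

    def __init__(self, n_base, lam, delta=1e3):
        self.lam = lam
        # Assumed initialization: zero intercept, plain average of base forecasts.
        self.theta = np.r_[0.0, np.full(n_base, 1.0 / n_base)]
        self.P = delta * np.eye(n_base + 1)  # large P = weak prior on theta

    def predict(self, base):
        return float(self.theta @ np.r_[1.0, base])

    def update(self, base, y):
        x = np.r_[1.0, base]
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)
        self.theta = self.theta + gain * (y - self.theta @ x)
        self.P = (self.P - np.outer(gain, Px)) / self.lam


class PolyAggregator:
    """Simplified MLpol-style aggregation (assumption, not the reference code)."""

    def __init__(self, n_experts):
        self.R = np.zeros(n_experts)  # cumulative linearized regret per expert
        self.S = np.zeros(n_experts)  # cumulative squared instantaneous regret

    def weights(self):
        # Weight = learning rate 1 / (1 + S) times positive part of the regret.
        w = np.maximum(self.R, 0.0) / (1.0 + self.S)
        total = w.sum()
        return w / total if total > 0 else np.full(len(w), 1.0 / len(w))

    def update(self, preds, w, y):
        agg = w @ preds
        grad = 2.0 * (agg - y)    # gradient of squared loss at the aggregate
        r = grad * (agg - preds)  # positive when an expert beats the aggregate
        self.R += r
        self.S += r ** 2


def run_melo_sketch(base_preds, y, lams=(0.9, 0.99, 0.999)):
    """base_preds: (T, K) array of base forecasts; y: (T,) outcomes."""
    T, K = base_preds.shape
    experts = [EWLSExpert(K, lam) for lam in lams]
    agg = PolyAggregator(K + len(lams))
    out = np.empty(T)
    for t in range(T):
        # Pool = raw base forecasts plus one EWLS-adapted forecast per lam.
        preds = np.r_[base_preds[t], [e.predict(base_preds[t]) for e in experts]]
        w = agg.weights()
        out[t] = w @ preds
        # The outcome y[t] is revealed only after the forecast is issued.
        agg.update(preds, w, y[t])
        for e in experts:
            e.update(base_preds[t], y[t])
    return out
```

Each step costs one rank-one update per forgetting factor plus a weight update, which is in keeping with the abstract's claim of lightweight recursive updates without retraining. Keeping the raw base forecasts in the pool is what lets the aggregate fall back on them in quiet regimes.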

Problem

Research questions and friction points this paper is trying to address.

online prediction

distribution shift

adaptation memory

non-stationary

forecasting

Innovation

Methods, ideas, or system contributions that make the work stand out.

online aggregation

distribution shift

exponentially weighted least squares