🤖 AI Summary
Smart meter time-series data frequently exhibit missingness across multiple time scales (30 minutes to one day) due to sensor failures, leading to biased analyses and inaccurate forecasting. This study presents the first systematic benchmark evaluating statistical imputation methods (linear/spline interpolation), traditional machine learning models (XGBoost, Random Forest), general-purpose large language models (LLaMA-2, Phi-3), and dedicated time-series foundation models (TimesNet, iTransformer, DLinear, etc.) for electricity load data imputation. Results show that time-series foundation models achieve an average 18.7% reduction in MAE under long-horizon missingness, significantly outperforming all baselines. However, their superior contextual modeling comes with a Pareto trade-off in computational cost: inference latency is 5–12× higher. These findings establish an empirical benchmark and provide actionable design guidelines for deploying generative AI in trustworthy energy data imputation.
📝 Abstract
The integrity of time series data in smart grids is often compromised by missing values due to sensor failures, transmission errors, or other disruptions. Gaps in smart meter data can bias consumption analyses and hinder reliable predictions, causing technical and economic inefficiencies. As smart meter data grows in volume and complexity, conventional techniques struggle with its nonlinear and nonstationary patterns. In this context, Generative Artificial Intelligence offers promising solutions that may outperform traditional statistical methods. In this paper, we evaluate two general-purpose Large Language Models and five Time Series Foundation Models for smart meter data imputation, comparing them with conventional Machine Learning and statistical models. We introduce artificial gaps (30 minutes to one day) into an anonymized public dataset to test inference capabilities. Results show that Time Series Foundation Models, with their contextual understanding and pattern recognition, could significantly enhance imputation accuracy in certain cases. However, the trade-off between computational cost and performance gains remains a critical consideration.
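The evaluation protocol described above (mask out contiguous readings, impute, score against the held-back ground truth) can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the synthetic half-hourly load series, the `mask_gap` helper, the gap start position, and the choice of linear interpolation as the statistical baseline are all assumptions for the sake of a runnable example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic half-hourly load series standing in for an anonymized smart meter feed
# (one week of readings with a daily cycle plus noise -- an assumed stand-in dataset).
idx = pd.date_range("2024-01-01", periods=48 * 7, freq="30min")
load = pd.Series(
    1.0
    + 0.5 * np.sin(2 * np.pi * np.arange(len(idx)) / 48)
    + 0.05 * rng.standard_normal(len(idx)),
    index=idx,
)

def mask_gap(series: pd.Series, start: int, n_steps: int) -> pd.Series:
    """Blank out n_steps consecutive readings starting at integer position `start`."""
    corrupted = series.copy()
    corrupted.iloc[start : start + n_steps] = np.nan
    return corrupted

# Gap lengths matching the paper's range: 30 minutes (1 step) up to one day (48 steps).
for n_steps in (1, 8, 48):
    corrupted = mask_gap(load, start=100, n_steps=n_steps)
    imputed = corrupted.interpolate(method="linear")  # statistical baseline
    gap = corrupted.isna()
    mae = (imputed[gap] - load[gap]).abs().mean()
    print(f"gap={n_steps:2d} steps  linear-interp MAE={mae:.4f}")
```

A model under test (ML, LLM, or foundation model) would simply replace the `interpolate` call; scoring only the masked positions keeps the comparison fair across gap lengths.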