🤖 AI Summary
This work presents the first systematic demonstration of significant user-level and record-level membership inference attack (MIA) privacy risks against time-series forecasting models, addressing a critical gap: prior MIA research focuses predominantly on classification tasks and lacks systematic evaluation in forecasting settings.
Method: We propose LiRA-TS, a time-series-adapted variant of the Likelihood Ratio Attack (LiRA), and the Deep Time Series (DTS) attack, an end-to-end MIA framework tailored to forecasting architectures. Experiments are conducted on the TUH-EEG and ELD datasets using mainstream forecasting models, including LSTM and the state-of-the-art N-HiTS.
Contribution/Results: We establish the first dedicated privacy benchmark for time-series forecasting models. Results reveal that longer prediction horizons and smaller training cohorts substantially amplify privacy leakage. DTS achieves near-perfect (≈100%) accuracy in user-level membership inference, significantly outperforming state-of-the-art classification-based MIA baselines adapted to the forecasting setting. This work provides foundational risk characterization and standardized evaluation tools for trustworthy time-series AI.
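To make the core mechanism concrete, below is a minimal, illustrative sketch of a LiRA-style membership score adapted to forecasting. This is not the authors' implementation: it simply assumes that per-record forecasting losses (e.g. MSE) from shadow models trained with ("in") and without ("out") the target record can each be modeled as a Gaussian, and scores membership by the log-likelihood ratio between the two fits. The function name `lira_score` and the synthetic loss values are our own choices for illustration.

```python
import numpy as np
from scipy.stats import norm


def lira_score(target_loss, in_losses, out_losses):
    """Likelihood-ratio membership score: higher means "more likely a member".

    target_loss: forecasting loss (e.g. MSE) of the target model on the record.
    in_losses:   losses of shadow models trained WITH the record.
    out_losses:  losses of shadow models trained WITHOUT the record.
    """
    # Fit a Gaussian to each shadow-loss distribution (epsilon avoids zero std).
    mu_in, sd_in = np.mean(in_losses), np.std(in_losses) + 1e-12
    mu_out, sd_out = np.mean(out_losses), np.std(out_losses) + 1e-12
    # Log-likelihood ratio of the observed loss under the two hypotheses.
    return norm.logpdf(target_loss, mu_in, sd_in) - norm.logpdf(target_loss, mu_out, sd_out)


# Synthetic example: members tend to have low forecasting loss.
in_l = [0.10, 0.12, 0.09, 0.11]   # shadow losses when the record was trained on
out_l = [0.90, 1.10, 1.00, 0.95]  # shadow losses when it was held out
member_score = lira_score(0.10, in_l, out_l)      # low loss -> positive score
non_member_score = lira_score(1.00, in_l, out_l)  # high loss -> negative score
```

A user-level variant would aggregate such scores over all windows belonging to one individual before thresholding, which is one plausible reason user-level attacks in the paper reach near-perfect detection.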
📝 Abstract
Membership inference attacks (MIAs) aim to determine whether specific data were used to train a model. While extensively studied on classification models, their impact on time series forecasting remains largely unexplored. We address this gap by introducing two new attacks: (i) an adaptation of multivariate LiRA, a state-of-the-art MIA originally developed for classification models, to the time-series forecasting setting, and (ii) a novel end-to-end learning approach called Deep Time Series (DTS) attack. We benchmark these methods against adapted versions of other leading attacks from the classification setting.
We evaluate all attacks in realistic settings on the TUH-EEG and ELD datasets, targeting two strong forecasting architectures, LSTM and the state-of-the-art N-HiTS, under both record- and user-level threat models. Our results show that forecasting models are vulnerable, with user-level attacks often achieving perfect detection. The proposed methods achieve the strongest performance in several settings, establishing new baselines for privacy risk assessment in time series forecasting. Furthermore, vulnerability increases with longer prediction horizons and smaller training populations, echoing trends observed in large language models.