🤖 AI Summary
Deep learning (DL) predictive models in industrial cyber-physical systems (CPS) suffer from insufficient robustness, and existing evaluation methods fail to reflect realistic perturbation scenarios. Method: We propose a practical, distributionally robust definition and a systematic evaluation framework. For the first time, we model real-world perturbations—including sensor drift, measurement noise, and irregular sampling—and integrate time-series perturbation generation, distributional shift quantification, and comparative evaluation across diverse DL architectures (RNN, CNN, Transformer, SSM). We establish the first robustness benchmark featuring multi-source real-world CPS time-series data and supporting reproducible assessment. Contribution/Results: We introduce a standardized robustness scoring mechanism for industrial CPS, deliver a ranked robustness comparison of mainstream models with root-cause attribution analysis, and open-source a comprehensive toolchain that significantly enhances the reliability of model selection and architecture design.
📝 Abstract
Cyber-Physical Systems (CPS) in domains such as manufacturing and energy distribution generate complex time series data crucial for Prognostics and Health Management (PHM). While Deep Learning (DL) methods have demonstrated strong forecasting capabilities, their adoption in industrial CPS remains limited due to insufficient robustness. Existing robustness evaluations primarily focus on formal verification or adversarial perturbations, inadequately representing the complexities encountered in real-world CPS scenarios. To address this, we introduce a practical robustness definition grounded in distributional robustness, explicitly tailored to industrial CPS, and propose a systematic framework for robustness evaluation. Our framework simulates realistic disturbances, such as sensor drift, noise, and irregular sampling, enabling thorough robustness analyses of forecasting models on real-world CPS datasets. The robustness definition provides a standardized score to quantify and compare model performance across diverse datasets, assisting in informed model selection and architecture design. Through extensive empirical studies evaluating prominent DL architectures (including recurrent, convolutional, attention-based, modular, and structured state-space models), we demonstrate the applicability and effectiveness of our approach. We publicly release our robustness benchmark to encourage further research and reproducibility.
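The perturbation types named in the abstract (sensor drift, measurement noise, irregular sampling) can be illustrated with a minimal sketch. Note this is an assumption-laden illustration, not the paper's released toolchain: the function names, parameter values, and the degradation-ratio score at the end are all hypothetical choices made here for concreteness.

```python
import numpy as np

def add_sensor_drift(x: np.ndarray, slope: float = 0.01) -> np.ndarray:
    """Linear drift: offset grows over time, mimicking sensor calibration loss."""
    return x + slope * np.arange(len(x))

def add_measurement_noise(x: np.ndarray, sigma: float = 0.1,
                          seed: int = 0) -> np.ndarray:
    """Additive i.i.d. Gaussian noise on each measurement."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, sigma, size=len(x))

def irregular_sampling(x: np.ndarray, drop_prob: float = 0.2,
                       seed: int = 0):
    """Randomly drop samples; returns surviving timestamps and values."""
    rng = np.random.default_rng(seed)
    keep = rng.random(len(x)) > drop_prob
    return np.arange(len(x))[keep], x[keep]

def degradation_ratio(err_clean: float, err_perturbed: float) -> float:
    """One hypothetical scalar robustness score: error inflation under
    perturbation (1.0 = fully robust, larger = more degradation)."""
    return err_perturbed / err_clean

# Apply each perturbation to a synthetic sensor signal.
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
drifted = add_sensor_drift(signal)
noisy = add_measurement_noise(signal)
timestamps, sparse = irregular_sampling(signal)
```

A benchmark in this spirit would feed both the clean and perturbed series to each candidate architecture and compare forecast errors, yielding a score comparable across models and datasets.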