🤖 AI Summary
Industrial robots in smart manufacturing face stealthy data-integrity attacks that existing intrusion detection and model-based approaches struggle to identify. This paper proposes ViSTR-GP, an online detection framework that integrates external visual sensing (joint-level masks obtained through SAM-Track-based interactive segmentation) with encoder-measured kinematic data. It builds a low-rank tensor-regression surrogate model of the robot state and uses a matrix-variate Gaussian process to capture spatiotemporal correlations in the residuals, which yields interpretable, adaptive thresholds. Detection is performed by frame-level statistical hypothesis testing that fuses the heterogeneous visual and encoder signals. Evaluated on a real robotic platform, ViSTR-GP improves detection sensitivity: it increases the detection rate for subtle attacks by 32.7% over baselines, achieves an average alarm lead time of 1.8 seconds, and reduces joint-angle reconstruction error by 41.5%.
📝 Abstract
Industrial robotic systems are central to automating smart manufacturing operations. Connected and automated factories face growing cybersecurity risks that can interrupt and damage physical operations. Among these threats, data-integrity attacks often involve sophisticated exploitation of vulnerabilities that lets an attacker access and manipulate operational data, which makes them difficult to detect with existing intrusion detection or model-based detection alone. This paper addresses the challenges of using existing side channels to detect data-integrity attacks in robotic manufacturing processes by developing an online detection framework, ViSTR-GP, that cross-checks encoder-reported measurements against a vision-based estimate from an overhead camera outside the controller's authority. In this framework, a one-time interactive segmentation initializes SAM-Track to generate per-frame masks. A low-rank tensor-regression surrogate maps each mask to measurements, while a matrix-variate Gaussian process models nominal residuals, capturing temporal structure and cross-joint correlations. A frame-wise test statistic derived from the predictive distribution provides an online detector with interpretable thresholds. We validate the framework on a real-world robotic testbed with synchronized video frames and encoder data, collecting multiple nominal cycles and constructing replay-attack scenarios with graded end-effector deviations. Results on the testbed indicate that the proposed framework recovers joint angles accurately and detects data-integrity attacks earlier and with more frequent alarms than all baselines. The improvements are largest for the most subtle attacks. These results show that plants can detect data-integrity attacks by adding an independent physical channel that lies outside the controller's authority, without complex instrumentation.
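The frame-wise cross-check between encoder readings and vision-based estimates can be illustrated with a deliberately simplified sketch. It is not the paper's implementation: in place of the matrix-variate Gaussian process it assumes a single stationary Gaussian fitted to nominal residuals, and in place of a threshold derived from the predictive distribution it assumes an empirical quantile of nominal statistics. All function names, the six-joint layout, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_nominal(residuals):
    """Fit mean and precision matrix to nominal residuals (frames x joints).
    Assumption: a stationary Gaussian stands in for the paper's GP model."""
    mu = residuals.mean(axis=0)
    cov = np.cov(residuals, rowvar=False)
    cov = cov + 1e-9 * np.eye(cov.shape[0])  # small ridge for invertibility
    return mu, np.linalg.inv(cov)

def frame_statistic(residuals, mu, precision):
    """Squared Mahalanobis distance of each frame's residual vector."""
    centered = np.atleast_2d(residuals) - mu
    return np.einsum("ij,jk,ik->i", centered, precision, centered)

def calibrate_threshold(nominal_stats, alpha=0.01):
    """Empirical (1 - alpha) quantile of statistics from nominal cycles."""
    return float(np.quantile(nominal_stats, 1.0 - alpha))

def detect(encoder_angles, vision_angles, mu, precision, threshold):
    """Flag frames whose encoder-vs-vision residual exceeds the threshold."""
    stats = frame_statistic(encoder_angles - vision_angles, mu, precision)
    return stats > threshold

# Synthetic illustration: 500 nominal frames, 6 joints (hypothetical values).
rng = np.random.default_rng(0)
nominal_residuals = rng.normal(0.0, 0.01, size=(500, 6))
mu, precision = fit_nominal(nominal_residuals)
threshold = calibrate_threshold(frame_statistic(nominal_residuals, mu, precision))

# Five test frames; frame 3 carries a falsified encoder report.
vision = rng.uniform(-1.0, 1.0, size=(5, 6))
encoder = vision.copy()
encoder[3] += 0.2
alarms = detect(encoder, vision, mu, precision, threshold)
```

The full covariance across joints means that correlated deviations are weighed jointly rather than joint by joint, which loosely mirrors the cross-joint correlations the abstract mentions. Temporal structure is ignored here, so this sketch would be less sensitive than a model that also tracks how residuals evolve over a cycle.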