Kill Two Birds with One Stone! Trajectory enabled Unified Online Detection of Adversarial Examples and Backdoor Attacks

📅 2025-06-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the fragmentation between online detection of adversarial examples and online detection of backdoor attacks. It proposes UniGuard, described as the first unified real-time detection framework for both threats. Its core idea is to model how layer-wise activations evolve during forward propagation as a trajectory: an LSTM captures the dynamic evolution of neuron responses, while a spectral transformation makes subtle anomalous signals easier to separate. UniGuard detects both threat classes at run time with a lightweight online detector, requiring no modification to the victim model and no access to training data. Evaluated across several input modalities (images, text, audio) and tasks (classification, regression), it consistently outperforms state-of-the-art methods, including ContraNet and TED. UniGuard is also reported to be robust against static and dynamic backdoors and black-box attacks, with low detection latency and straightforward deployment.

📝 Abstract

The proposed UniGuard is the first unified online detection framework capable of simultaneously addressing adversarial examples (AEs) and backdoor attacks. UniGuard builds upon two key insights. First, both AE and backdoor attacks have to compromise the inference phase, making it possible to tackle them simultaneously at run time via online detection. Second, an adversarial input, whether a perturbed sample in AE attacks or a trigger-carrying sample in backdoor attacks, exhibits a trajectory signature distinct from that of a benign sample as it propagates through the layers of a deep learning (DL) model during forward inference. The propagation trajectory of the adversarial sample must deviate from that of its benign counterpart; otherwise, the adversarial objective cannot be fulfilled. Detecting these trajectory signatures is inherently challenging because they are subtle; UniGuard overcomes this by treating the propagation trajectory as a time-series signal and leveraging an LSTM and spectrum transformation to amplify differences between adversarial and benign trajectories that are subtle in the time domain. UniGuard's efficiency and effectiveness have been extensively validated across modalities (image, text, and audio), tasks (classification and regression), and diverse model architectures, against a wide range of AE attacks and backdoor attacks, including challenging partial backdoors and dynamic triggers. When compared to SOTA methods, including ContraNet (NDSS 22), which is specific to AE detection, and TED (IEEE SP 24), which is specific to backdoor detection, UniGuard consistently demonstrates superior performance, even when each method is evaluated on the threat it was designed for: each SOTA method fails on some attack strategies, whereas UniGuard succeeds on all of them.
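To make the notion of a propagation trajectory concrete, the following is a minimal sketch rather than UniGuard's actual implementation. It assumes a toy ReLU MLP with random weights, and it assumes that each layer's activation is summarized by its L2 norm; the abstract does not specify the per-layer feature, so `forward_with_trajectory`, `magnitude_spectrum`, and the layer sizes are all hypothetical. The sketch records one value per layer during an ordinary forward pass and then applies a Fourier transform to the resulting sequence, treating layer depth as the time axis.

```python
import numpy as np

def forward_with_trajectory(x, weights):
    """Run a toy ReLU MLP and record one summary value per layer."""
    trajectory = []
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)            # ordinary layer output, model unchanged
        # Assumption: the L2 norm is an illustrative per-layer summary only.
        trajectory.append(np.linalg.norm(h))
    return h, np.array(trajectory)

def magnitude_spectrum(trajectory):
    """Magnitude of the real FFT of the mean-removed trajectory."""
    centered = trajectory - trajectory.mean()
    return np.abs(np.fft.rfft(centered))

rng = np.random.default_rng(0)
dims = [16, 32, 32, 32, 32, 32, 32, 32, 8]    # hypothetical 8-layer model
weights = [
    rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))
    for m, n in zip(dims[:-1], dims[1:])
]

x = rng.normal(size=16)
output, trajectory = forward_with_trajectory(x, weights)
spec = magnitude_spectrum(trajectory)

print("trajectory length:", trajectory.shape[0])
print("spectrum bins:", spec.shape[0])
```

The trajectory is read off during inference without touching the model's weights or its training data, which matches the online setting the abstract describes. In a real deployment the activations would more likely be captured with framework hooks than with a hand-written forward loop.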

Problem

Research questions and friction points this paper is trying to address.

Unified online detection of adversarial examples and backdoor attacks

Detecting subtle trajectory deviations in DL model propagation

Efficient cross-modal defense against diverse attack strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified online detection for adversarial and backdoor attacks

Leverages LSTM and spectrum transformation for trajectory analysis

Detects subtle trajectory deviations in time-series signals
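As a rough illustration of online detection from trajectory signals, the sketch below calibrates a threshold on benign trajectories and flags inputs whose spectrum lies far from the benign average. It deliberately swaps UniGuard's LSTM-based detector for a plain distance score, so it should be read as a simplified stand-in and not the paper's method. The class `SpectrumDistanceDetector`, the 5% false-positive rate, and the synthetic trajectories are all assumptions made for the example.

```python
import numpy as np

def spectrum(trajectory):
    """Magnitude of the real FFT of the mean-removed trajectory."""
    centered = trajectory - trajectory.mean()
    return np.abs(np.fft.rfft(centered))

class SpectrumDistanceDetector:
    """Hypothetical detector: distance to the mean benign spectrum."""

    def __init__(self, false_positive_rate=0.05):
        self.false_positive_rate = false_positive_rate
        self.reference = None
        self.threshold = None

    def fit(self, benign_trajectories):
        spectra = np.array([spectrum(t) for t in benign_trajectories])
        self.reference = spectra.mean(axis=0)
        scores = np.linalg.norm(spectra - self.reference, axis=1)
        # Threshold chosen so roughly the target fraction of benign
        # calibration inputs would be flagged.
        self.threshold = float(
            np.quantile(scores, 1.0 - self.false_positive_rate)
        )
        return self

    def score(self, trajectory):
        return float(np.linalg.norm(spectrum(trajectory) - self.reference))

    def is_suspicious(self, trajectory):
        return self.score(trajectory) > self.threshold

rng = np.random.default_rng(0)
depth = 12                                   # hypothetical number of layers
base = np.linspace(1.0, 3.0, depth)          # synthetic benign trend
benign = [base + rng.normal(scale=0.05, size=depth) for _ in range(200)]
detector = SpectrumDistanceDetector().fit(benign)

# Synthetic deviating trajectory: an oscillation over the later layers.
deviating = base.copy()
deviating[6:] += 0.8 * (-1.0) ** np.arange(depth - 6)

print("deviating flagged:", detector.is_suspicious(deviating))
```

Calibrating only on benign inputs keeps the detector independent of any particular attack, which is one plausible way to cover both AE and backdoor inputs with a single run-time check. The synthetic deviation here is much larger than the subtle deviations the paper targets, which is why the source relies on an LSTM and spectrum transformation instead of a simple distance.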
