TED++: Submanifold-Aware Backdoor Detection via Layerwise Tubular-Neighbourhood Screening

📅 2025-10-16
🤖 AI Summary
Detecting stealthy backdoor attacks in deep neural networks remains challenging, particularly under clean-data scarcity and distance-based adaptive attacks. To address this, we propose a submanifold-aware backdoor detection framework. Our method constructs layer-wise, class-specific tubular neighborhoods to characterize the geometric structure of hidden-layer features on class-specific submanifolds, and introduces a Locally Adaptive Ranking (LAR) mechanism to quantify cross-layer input deviation. Robust anomaly detection is achieved by aggregating multi-layer ranking sequences. The core innovation lies in integrating manifold geometric modeling with adaptive ranking, yielding significantly improved detection under extreme data constraints (e.g., only five clean samples per class) and against adaptive attacks. On multiple benchmarks, our method achieves up to a 14-percentage-point AUROC improvement over state-of-the-art approaches, approaching perfect detection.

📝 Abstract
As deep neural networks power increasingly critical applications, stealthy backdoor attacks, where poisoned training inputs trigger malicious model behaviour while appearing benign, pose a severe security risk. Many existing defences are vulnerable when attackers exploit subtle distance-based anomalies or when clean examples are scarce. To meet this challenge, we introduce TED++, a submanifold-aware framework that effectively detects subtle backdoors that evade existing defences. TED++ begins by constructing a tubular neighbourhood around each class's hidden-feature manifold, estimating its local "thickness" from a handful of clean activations. It then applies Locally Adaptive Ranking (LAR) to detect any activation that drifts outside the admissible tube. By aggregating these LAR-adjusted ranks across all layers, TED++ captures how faithfully an input remains on the evolving class submanifolds. Based on such characteristic "tube-constrained" behaviour, TED++ flags inputs whose LAR-based ranking sequences deviate significantly. Extensive experiments are conducted on benchmark datasets and tasks, demonstrating that TED++ achieves state-of-the-art detection performance under both adaptive-attack and limited-data scenarios. Remarkably, even with only five held-out examples per class, TED++ still delivers near-perfect detection, achieving gains of up to 14% in AUROC over the next-best method. The code is publicly available at https://github.com/namle-w/TEDpp.
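The abstract's pipeline (estimate a per-class tube "thickness" from a few clean activations, rank how far a test activation drifts outside it, then aggregate ranks over layers) can be caricatured in a few lines. This is a minimal sketch, not the authors' implementation: it uses per-class centroid distances as a crude stand-in for the paper's tubular-neighbourhood construction, and the function names (`tube_radii`, `lar_rank`, `ted_score`) and the mean-rank aggregation rule are illustrative assumptions.

```python
import numpy as np

def tube_radii(clean_feats):
    """Crude per-layer 'tube thickness' estimate: distances of each clean
    activation to its class centroid (stand-in for TED++'s tubular
    neighbourhood around the class submanifold)."""
    centroid = clean_feats.mean(axis=0)
    return centroid, np.linalg.norm(clean_feats - centroid, axis=1)

def lar_rank(x_feat, centroid, radii):
    """Locally adaptive rank: the fraction of clean samples whose centroid
    distance is smaller than the input's (0 = deep inside the tube,
    1 = farther out than every clean sample)."""
    d = np.linalg.norm(x_feat - centroid)
    return np.mean(radii < d)

def ted_score(layer_feats, clean_layer_feats):
    """Aggregate LAR ranks across layers; a persistently high rank means
    the input keeps drifting outside the class tubes."""
    ranks = []
    for x_feat, clean_feats in zip(layer_feats, clean_layer_feats):
        centroid, radii = tube_radii(clean_feats)
        ranks.append(lar_rank(x_feat, centroid, radii))
    return float(np.mean(ranks))

# Toy demo: 3 layers, 4-dim features, and only 5 clean activations per
# layer (mirroring the paper's extreme low-data regime).
rng = np.random.default_rng(0)
clean = [rng.normal(0.0, 1.0, size=(5, 4)) for _ in range(3)]
benign = [c.mean(axis=0) for c in clean]         # sits inside every tube
trojan = [c.mean(axis=0) + 10.0 for c in clean]  # far outside every tube
print(ted_score(benign, clean))  # low score
print(ted_score(trojan, clean))  # maximal score
```

In this toy setup the benign input scores 0.0 and the off-manifold input scores 1.0; the real method additionally handles curvature of the class submanifolds, which a centroid-distance proxy ignores.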
Problem

Research questions and friction points this paper is trying to address.

Detects subtle backdoor attacks in neural networks
Identifies malicious inputs using submanifold tubular neighborhoods
Works effectively with limited clean data samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructs tubular neighbourhoods around class manifolds
Applies Locally Adaptive Ranking for drift detection
Aggregates layerwise ranks to identify deviations
Nam Le
School of Information Technology, Deakin University, Australia
Leo Yu Zhang
School of Information and Communication Technology, Griffith University, Australia
Kewen Liao
Associate Professor, School of IT, Deakin University
Algorithms, Data Analytics, Machine Learning
Shirui Pan
Professor, ARC Future Fellow, FQA, Director of TrustAGI Lab, Griffith University
Data Mining, Machine Learning, Graph Neural Networks, Trustworthy AI, Time Series
Wei Luo
School of Information Technology, Deakin University, Australia