Model-agnostic clean-label backdoor mitigation in cybersecurity environments

📅 2024-07-11
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Detecting clean-label backdoor attacks—where attackers poison training samples without altering their labels—remains a critical challenge in cybersecurity. This paper proposes a general, model-agnostic defense framework. The method introduces a suspicious-cluster isolation mechanism based on density clustering in a feature subspace combined with iterative anomaly scoring, and it requires no prior knowledge of triggers, no model-interpretability assumptions, and no clean validation set. By adaptively selecting discriminative subspaces and applying density-driven anomaly detection, it identifies and isolates poisoned samples. Evaluated on network traffic and malware classification tasks, the approach is robust against two representative clean-label attack variants: accuracy degradation remains below 2%, and the framework supports heterogeneous models—including gradient-boosted trees and neural networks—without architectural modification.

📝 Abstract
The training phase of machine learning models is a delicate step, especially in cybersecurity contexts. Recent research has surfaced a series of insidious training-time attacks that inject backdoors into models designed for security classification tasks without altering the training labels. With this work, we propose new techniques that leverage insights into cybersecurity threat models to effectively mitigate these clean-label poisoning attacks while preserving model utility. By performing density-based clustering on a carefully chosen feature subspace, and progressively isolating the suspicious clusters through a novel iterative scoring procedure, our defensive mechanism can mitigate the attacks without requiring many of the common assumptions in the existing backdoor defense literature. To show the generality of our proposed mitigation, we evaluate it on two clean-label, model-agnostic attacks across two classic cybersecurity data modalities—network flow classification and malware classification—using gradient boosting and neural network models.
Problem

Research questions and friction points this paper is trying to address.

Mitigate clean-label backdoor attacks in cybersecurity models
Preserve model utility while defending against poisoning
Apply defense to network flow and malware classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Density-based clustering in feature subspace
Novel iterative scoring for suspicious clusters
Model-agnostic defense for clean-label attacks
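The defense outlined above can be illustrated with a toy sketch: cluster training samples by density in a (pre-selected) feature subspace, then score each cluster by how tight and how far from the bulk of the data it is, flagging the most anomalous one. This is an assumption-laden illustration, not the paper's procedure: the clustering here is a minimal pure-Python DBSCAN, and the `suspicious_clusters` scoring heuristic (offset-to-spread ratio) is invented for the sketch; the paper's subspace selection and iterative scoring differ in detail.

```python
import math
from collections import defaultdict

def dbscan(points, eps=0.5, min_pts=3):
    """Minimal DBSCAN over 2-D points: returns one label per point (-1 = noise)."""
    n = len(points)
    labels = [None] * n  # None = unvisited
    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:  # expand the cluster from its core points
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point -> border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:
                seeds.extend(jn)
    return labels

def suspicious_clusters(points, labels, top_k=1):
    """Toy anomaly scoring: a cluster that is tight (small spread) yet far
    from the global centroid gets a high score. Hypothetical heuristic."""
    clusters = defaultdict(list)
    for p, l in zip(points, labels):
        if l != -1:
            clusters[l].append(p)
    centroid = [sum(c) / len(points) for c in zip(*points)]
    def score(members):
        cm = [sum(c) / len(members) for c in zip(*members)]
        spread = sum(math.dist(p, cm) for p in members) / len(members)
        return math.dist(cm, centroid) / (spread + 1e-9)
    ranked = sorted(clusters, key=lambda l: score(clusters[l]), reverse=True)
    return ranked[:top_k]
```

In the paper's framework this pass would be repeated: flagged clusters are isolated, scores are recomputed on the remainder, and the loop continues until no cluster stands out, with the clustering performed in an adaptively chosen feature subspace rather than the raw space.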