Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations

📅 2026-04-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines a weakness of existing adversarial malware in non-stationary environments: the distributional shift that perturbations induce can itself reveal the attack to drift monitors. To improve stealth, the authors add a distributional similarity constraint to adversarial example generation. They formulate a targeted attack in the classifier's standardized feature space and use similarity regularizers to balance the misclassification objective against detectable drift, then compare classifier output probabilities under multiple drift metrics. Experiments show that similarity constraints can reduce output drift signals, with ℓ₂ regularization giving the most promising results. A larger perturbation budget raises attack success rates but also produces stronger drift indicators, exposing a trade-off between efficacy and stealth.
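
One way to write the kind of objective described above, offered as a sketch and not as the authors' exact formulation. All symbols are assumptions introduced here: f is the classifier, x a clean malware sample in standardized feature space, δ the perturbation, y_t the target (benign) label, λ the regularization weight, and ε the perturbation budget. The norm used for the budget is not stated in this summary, so the ℓ∞ bound below is purely illustrative.

```latex
\min_{\delta}\; \mathcal{L}_{\mathrm{target}}\bigl(f(x+\delta),\, y_t\bigr) \;+\; \lambda\,\lVert \delta \rVert_2^2
\quad \text{subject to} \quad \lVert \delta \rVert_\infty \le \epsilon
```

Under this reading, λ = 0 gives an unregularized targeted attack, while a larger λ keeps the adversarial sample closer to the clean one, which is the lever for trading attack success against drift.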

📝 Abstract

Deep learning has emerged as a powerful approach for malware detection, demonstrating impressive accuracy across various data representations. However, these models face critical limitations in real-world, non-stationary environments where both malware characteristics and detection systems continuously evolve. Our research investigates a fundamental security question: Can an attacker generate adversarial malware samples that simultaneously evade classification and remain inconspicuous to drift monitoring mechanisms? We propose a novel approach that generates targeted adversarial examples in the classifier's standardized feature space, augmented with sophisticated similarity regularizers. By carefully constraining perturbations to maintain distributional similarity with clean malware, we create an optimization objective that balances targeted misclassification with drift signal minimization. We quantify the effectiveness of this approach by comprehensively comparing classifier output probabilities using multiple drift metrics. Our experiments demonstrate that similarity constraints can reduce output drift signals, with $\ell_2$ regularization showing the most promising results. We observe that perturbation budget significantly influences the evasion-detectability trade-off, with increased budget leading to higher attack success rates and more substantial drift indicators.
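
The trade-off described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: the logistic "detector" with weights `W`, the functions `attack` and `evaluate`, the per-coordinate (ℓ∞) budget, and the squared ℓ₂ penalty on the perturbation are all hypothetical choices made for illustration. The abstract does not name its drift metrics, so the Kolmogorov-Smirnov statistic and the 1-Wasserstein distance are used here as stand-ins, computed between classifier outputs on clean malware and on its adversarial counterparts.

```python
import bisect
import math
import random

# Toy stand-in for a trained detector: logistic regression on standardized
# features. The weights are arbitrary illustrative values, not from the paper.
W = [1.5, -1.0, 0.8, 0.5]
B = 0.0

def p_malicious(x):
    """Probability the toy classifier assigns to the malicious class."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def attack(x, lam, budget=1.0, steps=200, lr=0.05):
    """Targeted attack toward 'benign' with an L2 similarity penalty.

    Minimizes -log(1 - p(x + d)) + lam * ||d||_2^2 by gradient descent,
    clipping each coordinate of d to [-budget, budget] (assumed budget form).
    """
    d = [0.0] * len(x)
    for _ in range(steps):
        p = p_malicious([xi + di for xi, di in zip(x, d)])
        # Gradient of the target loss is p * W; of the penalty, 2 * lam * d.
        d = [di - lr * (p * w + 2.0 * lam * di) for di, w in zip(d, W)]
        d = [max(-budget, min(budget, di)) for di in d]
    return [xi + di for xi, di in zip(x, d)]

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (largest gap between ECDFs)."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, t) / len(a) - bisect.bisect_right(b, t) / len(b))
        for t in a + b
    )

def wasserstein1(a, b):
    """1-Wasserstein distance between two equal-sized samples."""
    assert len(a) == len(b)
    return sum(abs(u - v) for u, v in zip(sorted(a), sorted(b))) / len(a)

def evaluate(lam, n=200, seed=0):
    """Attack n synthetic 'malware' points and measure success and output drift."""
    rng = random.Random(seed)
    # Synthetic clean malware: points the toy model scores as malicious.
    clean = [[0.8 * w + rng.gauss(0.0, 0.3) for w in W] for _ in range(n)]
    adv = [attack(x, lam) for x in clean]
    p_clean = [p_malicious(x) for x in clean]
    p_adv = [p_malicious(x) for x in adv]
    return {
        "success_rate": sum(p < 0.5 for p in p_adv) / n,
        "ks": ks_statistic(p_clean, p_adv),
        "w1": wasserstein1(p_clean, p_adv),
    }
```

On this toy setup, comparing `evaluate(lam=0.0)` with `evaluate(lam=1.0)` shows the regularized attack producing a smaller Wasserstein shift in output probabilities and a success rate that is no higher. The KS statistic can saturate at 1.0 when the two output distributions barely overlap, which is one reason a monitor might track more than one metric.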

Problem

Research questions and friction points this paper is trying to address.

adversarial evasion

non-stationary

malware detection

drift signals

distributional similarity

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial evasion

non-stationary environments

similarity-constrained perturbations