Evaluating the Robustness of a Production Malware Detection System to Transferable Adversarial Attacks

📅 2025-10-02
🤖 AI Summary
This study exposes transferable adversarial vulnerabilities in the machine learning components of production-grade malware detection systems, using Gmail’s detection pipeline, and specifically its file-type identification model Magika, as a case study. We propose a hybrid black-box/white-box method for generating adversarial samples that changes only 13 bytes of a malware sample yet misleads Magika in 90% of cases, so that malicious files are misclassified and routed to an unsuitable malware detector. The result shows how a weakness in a single ML model can cascade through an end-to-end security chain. To mitigate this, we design a robustness-enhancing defense: against the defended model, a highly resourced adversary needs 50 bytes to reach only a 20% attack success rate. The defense has been deployed in Gmail’s production environment.

📝 Abstract
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, through a case study of Gmail's pipeline, where file-type identification relies on an ML model. The malware detection pipeline used by Gmail contains a machine learning model that routes each potential malware sample to a specialized malware classifier to improve accuracy and performance. This model, called Magika, has been open-sourced. By designing adversarial examples that fool Magika, we can cause the production malware service to route malware to an unsuitable malware detector, thereby increasing our chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we can evade Magika in 90% of cases, which allows us to send malware files over Gmail. We then turn our attention to defenses and develop an approach to mitigate the severity of these types of attacks. For our defended production model, a highly resourced adversary requires 50 bytes to achieve just a 20% attack success rate. We implement this defense, and, thanks to a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.
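The routing failure described in the abstract can be illustrated with a toy pipeline. This is a minimal sketch, not Gmail's pipeline or Magika's API: `identify_file_type`, the three detectors, and the byte patterns are all hypothetical stand-ins, and the "model" is a simple rule rather than a learned classifier. It also assumes the overwritten bytes are padding that does not affect how the file executes.

```python
from typing import Callable, Dict, Tuple


def identify_file_type(data: bytes) -> str:
    """Hypothetical stand-in for a file-type model (NOT Magika's API)."""
    head = data[:32]
    if b"%PDF" in head:
        return "pdf"
    if head.startswith(b"MZ"):
        return "pe"
    return "unknown"


# Hypothetical specialized detectors, one per file type.
def pe_detector(data: bytes) -> bool:
    return b"EVIL_PAYLOAD" in data


def pdf_detector(data: bytes) -> bool:
    return b"/JavaScript" in data


def generic_detector(data: bytes) -> bool:
    return False  # weak fallback that flags nothing


DETECTORS: Dict[str, Callable[[bytes], bool]] = {
    "pe": pe_detector,
    "pdf": pdf_detector,
}


def scan(data: bytes) -> Tuple[str, bool]:
    """Route the sample by predicted type, then run only that detector."""
    file_type = identify_file_type(data)
    detector = DETECTORS.get(file_type, generic_detector)
    return file_type, detector(data)


# Toy "malware": an MZ header, padding, and a payload marker.
malware = b"MZ" + b"\x00" * 30 + b"EVIL_PAYLOAD"

# Overwrite 4 padding bytes so the toy type model predicts "pdf" instead.
adversarial = malware[:8] + b"%PDF" + malware[12:]
changed = sum(a != b for a, b in zip(malware, adversarial))

print(scan(malware))      # routed to the PE detector, which flags it
print(scan(adversarial))  # routed to the PDF detector, which misses it
print(changed)            # number of bytes modified
```

The payload is untouched in the perturbed sample; detection fails only because the router sends it to a detector that never looks for that payload. That is the system-level effect the abstract attributes to fooling the routing model.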
Problem

Research questions and friction points this paper is trying to address.

Evaluating robustness of malware detection to adversarial attacks
Studying ML component vulnerabilities in production security systems
Developing defenses against transferable adversarial evasion techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial examples that fool Magika misroute malware to an unsuitable detector
Changing just 13 bytes of a malware sample evades Magika in 90% of cases
Defense deployed in production limits a highly resourced adversary to a 20% success rate at 50 bytes
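The figures quoted on this page pair a byte budget with an attack success rate. A minimal sketch of how such a metric could be computed is shown below. The definition of "success" used here (the predicted label changes and the modification stays within the budget) is an assumption for illustration, not the paper's stated evaluation protocol, and `toy_classify` is a hypothetical classifier.

```python
from typing import Callable, Iterable, Tuple


def bytes_changed(original: bytes, adversarial: bytes) -> int:
    """Differing positions, plus any difference in length."""
    differing = sum(a != b for a, b in zip(original, adversarial))
    return differing + abs(len(original) - len(adversarial))


def attack_success_rate(
    pairs: Iterable[Tuple[bytes, bytes]],
    classify: Callable[[bytes], str],
    budget: int,
) -> float:
    """Fraction of (original, adversarial) pairs that flip the label
    while modifying at most `budget` bytes (assumed definition)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sample pair")
    successes = 0
    for original, adversarial in pairs:
        within_budget = bytes_changed(original, adversarial) <= budget
        flipped = classify(adversarial) != classify(original)
        if within_budget and flipped:
            successes += 1
    return successes / len(pairs)


# Hypothetical classifier used only to exercise the metric.
def toy_classify(data: bytes) -> str:
    return "pdf" if b"%PDF" in data[:32] else "other"


base = b"MZ" + b"\x00" * 30
pairs = [
    (base, b"MZ%PDF" + b"\x00" * 26),  # 4 bytes changed, label flips
    (base, b"MZ" + b"\x01" * 30),      # 30 bytes changed, label unchanged
    (base, b"%PDF" * 8),               # 32 bytes changed, label flips
]

rate_13 = attack_success_rate(pairs, toy_classify, budget=13)
rate_32 = attack_success_rate(pairs, toy_classify, budget=32)
print(rate_13, rate_32)
```

Reporting the rate as a function of the budget makes attack and defense results comparable: a stronger defense shows up as a lower success rate at the same budget, or as a larger budget needed to reach the same rate.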