Regression-aware Continual Learning for Android Malware Detection

📅 2025-07-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
In Android malware detection, continual learning (CL) can introduce "security regression": malware samples that were previously detected become false negatives after a model update, which poses a serious security risk. Method: This paper formalizes and quantifies security regression and proposes a regression-aware continual learning approach. Its core component is a model-agnostic adaptation of Positive Congruent Training (PCT) to the CL setting, which adds a regression-aware penalty so that the updated model preserves the prior model's predictive behavior during incremental updates. Results: Experiments on the ELSA, Tesseract, and AZ-Class datasets show that the method reduces security regression across different CL scenarios while maintaining strong detection performance over time.

📝 Abstract
Malware evolves rapidly, forcing machine learning (ML)-based detectors to adapt continuously. With antivirus vendors processing hundreds of thousands of new samples daily, datasets can grow to billions of examples, making full retraining impractical. Continual learning (CL) has emerged as a scalable alternative, enabling incremental updates without full data access while mitigating catastrophic forgetting. In this work, we analyze a critical yet overlooked issue in this context: security regression. Unlike forgetting, which manifests as a general performance drop on previously seen data, security regression captures harmful prediction changes at the sample level, such as a malware sample that was once correctly detected but evades detection after a model update. Although often overlooked, regressions pose serious risks in security-critical applications, as the silent reintroduction of previously detected threats in the system may undermine users' trust in the whole updating process. To address this issue, we formalize and quantify security regression in CL-based malware detectors and propose a regression-aware penalty to mitigate it. Specifically, we adapt Positive Congruent Training (PCT) to the CL setting, preserving prior predictive behavior in a model-agnostic manner. Experiments on the ELSA, Tesseract, and AZ-Class datasets show that our method effectively reduces regression across different CL scenarios while maintaining strong detection performance over time.
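
To make the sample-level notion of security regression concrete, the sketch below counts malware samples that an old model detected but an updated model misses. The function name `security_regression_rate`, the 1 = malware / 0 = goodware encoding, and the choice of denominator (malware the old model caught) are assumptions made for illustration; the paper's formal definition may normalize differently.

```python
def security_regression_rate(y_true, old_pred, new_pred):
    """Fraction of previously detected malware that the new model misses.

    Assumed encoding: 1 = malware, 0 = goodware (labels and predictions).
    """
    # Malware samples that the previous model detected correctly.
    previously_detected = [
        i for i, (y, old) in enumerate(zip(y_true, old_pred))
        if y == 1 and old == 1
    ]
    if not previously_detected:
        return 0.0
    # Of those, the samples the updated model now labels as goodware.
    regressed = [i for i in previously_detected if new_pred[i] == 0]
    return len(regressed) / len(previously_detected)


def accuracy(y_true, pred):
    """Plain accuracy, used only to contrast with the regression rate."""
    return sum(int(y == p) for y, p in zip(y_true, pred)) / len(y_true)


# Toy data (hypothetical): the update fixes two old errors but loses sample 1.
y_true = [1, 1, 1, 1, 0, 0]
old_pred = [1, 1, 1, 0, 0, 1]
new_pred = [1, 0, 1, 1, 0, 0]

rate = security_regression_rate(y_true, old_pred, new_pred)
old_acc = accuracy(y_true, old_pred)
new_acc = accuracy(y_true, new_pred)
print(f"old accuracy={old_acc:.2f}, new accuracy={new_acc:.2f}, regression rate={rate:.2f}")
```

In this toy case the updated model is more accurate overall, yet one of the three malware samples the old model caught now evades detection. An aggregate metric such as accuracy hides exactly this kind of harmful per-sample change.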
Problem

Research questions and friction points this paper is trying to address.

Addressing security regression in continual learning for malware detection
Mitigating harmful prediction changes in updated malware detectors
Preserving prior predictive behavior without full retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts Positive Congruent Training for CL
Formalizes and quantifies security regression
Introduces regression-aware penalty in updates
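
One way to picture a regression-aware penalty is as an extra loss term that pulls the updated model toward the old model's outputs, weighted more heavily on samples the old model classified correctly. The sketch below is loosely modeled on that idea and is not the paper's formulation: the squared-difference penalty, the function name `regression_aware_loss`, and the parameters `lam`, `alpha`, `beta`, and `threshold` are all assumptions for illustration.

```python
import math


def regression_aware_loss(y_true, p_old, p_new, lam=1.0, alpha=0.0, beta=1.0,
                          threshold=0.5):
    """Mean binary cross-entropy plus a weighted consistency penalty.

    y_true: labels (1 = malware, 0 = goodware).
    p_old, p_new: malware probabilities from the old and updated models.
    lam: strength of the penalty (hypothetical hyperparameter).
    alpha, beta: base weight and extra weight for samples the old model
        classified correctly (hypothetical hyperparameters).
    """
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for y, po, pn in zip(y_true, p_old, p_new):
        # Standard binary cross-entropy on the updated model's output.
        bce = -(y * math.log(pn + eps) + (1 - y) * math.log(1 - pn + eps))
        # Did the old model classify this sample correctly?
        old_correct = (po >= threshold) == (y == 1)
        weight = alpha + (beta if old_correct else 0.0)
        # Assumed penalty: squared distance between new and old outputs.
        penalty = weight * (pn - po) ** 2
        total += bce + lam * penalty
    return total / len(y_true)


# Toy data (hypothetical): the second malware sample regresses from 0.8 to 0.2.
y_true = [1, 1, 0]
p_old = [0.9, 0.8, 0.1]
p_regressed = [0.9, 0.2, 0.1]

plain = regression_aware_loss(y_true, p_old, p_regressed, lam=0.0)
penalized = regression_aware_loss(y_true, p_old, p_regressed, lam=1.0)
unchanged = regression_aware_loss(y_true, p_old, p_old, lam=5.0)
print(f"plain={plain:.4f}, penalized={penalized:.4f}, unchanged={unchanged:.4f}")
```

Because the penalty depends only on the two models' outputs, a term of this shape can be attached to any detector that produces a score, which is in keeping with the model-agnostic framing above. With `alpha=0`, samples the old model got wrong are left free to change, so the update can still correct past errors.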
Authors

Daniele Ghiani
Department of Electrical and Electronic Engineering, University of Cagliari, Italy

Daniele Angioni
Department of Electrical and Electronic Engineering, University of Cagliari, Italy

Giorgio Piras
Department of Electrical and Electronic Engineering, University of Cagliari, Italy

Angelo Sotgiu
Assistant Professor, University of Cagliari

Luca Minnei
Department of Electrical and Electronic Engineering, University of Cagliari, Italy

Srishti Gupta
Indian Institute of Technology Patna
Natural Language Processing, Dialog Systems, HCI, Text Generation

Maura Pintor
University of Cagliari
Machine Learning, Adversarial Machine Learning, Computer Security

Fabio Roli
Professor, University of Genova and Cagliari, Italy
Pattern recognition, machine learning, computer vision, computer security

Battista Biggio
Professor of Computer Engineering, University of Cagliari, Italy
Adversarial Machine Learning, AI Security, Machine Learning, Computer Security