🤖 AI Summary
This work addresses temporal bias in Android malware detection, which arises when models are trained and evaluated without regard to app release times. The authors construct a time-stamped dataset of benign and malicious apps and apply a timestamp-verification procedure to ensure the timestamps are accurate. Their detection framework combines BYOL-based self-supervised pre-training, used to learn obfuscation-resilient representations, with supervised classification. Under a time-aware evaluation protocol, the approach achieves 98% accuracy and an 89% F1 score. The study also characterizes malware behavior by analyzing true positives and false negatives with VirusTotal and MITRE ATT&CK. The dataset and source code are released to support reproducibility and future research.
📝 Abstract
Android malware detectors built with machine learning often suffer from temporal bias: models are trained and evaluated without respecting apps' actual release times, inflating accuracy and weakening real-world robustness. We address this by constructing a time-stamped dataset of benign and malicious Android apps and introducing a timestamp-verification procedure to ensure temporal accuracy. We then propose a detection framework that uses Bootstrap Your Own Latent (BYOL) for self-supervised pre-training to learn obfuscation-resilient representations, followed by supervised classification. Under time-aware evaluation, the method attains 98% accuracy and 89% F1. We further characterize malware behavior by analyzing true positives and false negatives using VirusTotal and the MITRE ATT&CK framework. To support reproducibility and further innovation, we release our dataset and source code.
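The two ideas named in the abstract, time-aware evaluation and BYOL pre-training, can be illustrated with a minimal sketch. This is not the authors' released code: the `App` record, the cutoff date, and the `tau` value are hypothetical choices made for illustration. `byol_loss` and `ema_update` follow the standard BYOL formulation (a normalized regression loss between the online prediction and the target projection, and an exponential moving average for the target network), written in plain Python rather than a deep learning framework.

```python
from dataclasses import dataclass
from datetime import date
import math


@dataclass(frozen=True)
class App:
    # Hypothetical record; the released dataset's schema may differ.
    name: str
    released: date
    is_malware: bool


def temporal_split(apps, cutoff):
    """Train on apps released before `cutoff`, test on apps released on or after it."""
    train = [a for a in apps if a.released < cutoff]
    test = [a for a in apps if a.released >= cutoff]
    return train, test


def byol_loss(online_prediction, target_projection):
    """Standard BYOL loss for one pair of views: 2 - 2 * cosine similarity."""
    dot = sum(p * t for p, t in zip(online_prediction, target_projection))
    norm_p = math.sqrt(sum(p * p for p in online_prediction))
    norm_t = math.sqrt(sum(t * t for t in target_projection))
    return 2.0 - 2.0 * dot / (norm_p * norm_t)


def ema_update(target_weights, online_weights, tau=0.99):
    """Move target weights toward online weights; tau here is an illustrative value."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_weights, online_weights)]


# Toy data: names, dates, and labels are invented for this example.
apps = [
    App("a", date(2019, 5, 1), False),
    App("b", date(2020, 2, 10), True),
    App("c", date(2021, 7, 4), False),
    App("d", date(2022, 1, 15), True),
]
train, test = temporal_split(apps, cutoff=date(2021, 1, 1))
```

Splitting on release date rather than at random means no test app predates any training app, which is the property that a time-aware protocol is meant to guarantee. The loss is 0 when the two vectors point in the same direction and 4 when they point in opposite directions.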