Ranking-enhanced anomaly detection using Active Learning-assisted Attention Adversarial Dual AutoEncoder

Date: 2025-11-24
Venue: Scientific Reports
Citations: 0
Influential: 0
AI Summary
Problem: Advanced Persistent Threat (APT) detection suffers from extreme scarcity of positive samples (as low as 0.004% of the data), which makes conventional supervised learning impractical. Method: This paper proposes the Attention-based Adversarial Dual Autoencoder (AADA) framework, which integrates unsupervised anomaly modeling with uncertainty-driven active learning. AADA employs dual autoencoders to jointly model normal behavioral distributions, incorporates attention mechanisms to enhance discriminative feature representation, adopts adversarial training to improve robustness, and introduces a ranking-enhanced active querying strategy to identify the most informative samples for human annotation. Contribution/Results: Evaluated on the DARPA multi-platform provenance dataset, AADA is reported to outperform state-of-the-art methods under extremely limited labeling budgets, achieving higher APT detection rates while reducing annotation effort. It offers a scalable, end-to-end approach to APT detection in label-scarce scenarios.
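
The uncertainty-driven querying step can be illustrated with a minimal sketch. It assumes every record already has an anomaly score in [0, 1] (for example, a normalized reconstruction error) and that "uncertain" means "closest to a decision threshold". The function name `select_queries`, the threshold of 0.5, and the toy scores are hypothetical and are not taken from the paper, whose actual ranking-enhanced strategy may differ.

```python
def select_queries(scores, labeled, budget, threshold=0.5):
    """Pick the `budget` unlabeled records whose scores are closest to `threshold`.

    scores    -- list of anomaly scores in [0, 1] (assumed precomputed)
    labeled   -- set of indices that already have an oracle label
    budget    -- number of records the analyst can label this round
    threshold -- hypothetical decision boundary between benign and anomalous
    """
    candidates = [i for i in range(len(scores)) if i not in labeled]
    # Smallest distance to the threshold = most ambiguous; ties break by index.
    candidates.sort(key=lambda i: (abs(scores[i] - threshold), i))
    return candidates[:budget]


# Toy example: record 5 is already labeled, so it is never queried again.
toy_scores = [0.02, 0.48, 0.97, 0.55, 0.10, 0.51]
queried = select_queries(toy_scores, labeled={5}, budget=2)
print(queried)  # -> [1, 3]
```

Records 1 and 3 are chosen because their scores (0.48 and 0.55) are the most ambiguous among the unlabeled ones. In a full loop, the oracle's answers for these records would be fed back into training before the next round of scoring.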

๐Ÿ“ Abstract
Advanced Persistent Threats (APTs) pose a significant challenge in cybersecurity due to their stealthy and long-term nature. Supervised learning methods require extensive labeled data, which is often scarce in real-world cybersecurity environments. In this paper, we propose an approach that uses AutoEncoders for unsupervised anomaly detection, augmented by active learning to iteratively improve the detection of APT anomalies. By selectively querying an oracle for labels on uncertain or ambiguous samples, the model improves its detection rate while keeping manual labeling to a minimum. We provide a detailed formulation of the proposed Attention Adversarial Dual AutoEncoder-based anomaly detection framework and show how the active learning loop iteratively enhances the model. The framework is evaluated on real-world imbalanced provenance trace databases produced by the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004% of the data. The datasets span multiple operating systems, including Android, Linux, BSD, and Windows, and cover two attack scenarios. The results show significant improvements in detection rates during active learning and better performance than existing approaches.
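
To make the 0.004% figure concrete, the short calculation below shows what that attack fraction means for a trace of a given size, and why plain accuracy is uninformative at this level of imbalance. The trace size of 1,000,000 records and the helper name `always_benign_baseline` are assumptions chosen for illustration; only the 0.004% fraction comes from the abstract.

```python
from fractions import Fraction


def always_benign_baseline(n_records, attack_fraction):
    """Score a trivial detector that labels every record as benign.

    Returns (number of attack records, accuracy, recall on attacks).
    """
    n_attacks = int(n_records * attack_fraction)
    # Every benign record is classified correctly, every attack is missed.
    accuracy = (n_records - n_attacks) / n_records
    recall = 0.0 if n_attacks else float("nan")
    return n_attacks, accuracy, recall


# 0.004% written exactly as a fraction: 4 attacks per 100,000 records.
attack_fraction = Fraction(4, 100_000)
n_attacks, accuracy, recall = always_benign_baseline(1_000_000, attack_fraction)
print(n_attacks, accuracy, recall)
```

Under these assumptions the trace contains only 40 attack records, and a detector that never raises an alert is still about 99.996% accurate while finding none of them. This is why the evaluation focuses on detection rates and why each oracle label is costly to obtain.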
Problem

Research questions and friction points this paper is trying to address.

Detecting stealthy Advanced Persistent Threats in cybersecurity
Reducing reliance on labeled data through unsupervised learning
Improving anomaly detection accuracy with minimal manual labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

AutoEncoders for unsupervised anomaly detection
Active learning to minimize labeling costs
Attention Adversarial Dual AutoEncoder framework