🤖 AI Summary
Detecting novel stealthy attacks, such as Advanced Persistent Threats (APTs), in cloud environments under few-shot conditions remains a significant challenge. To address this, the paper proposes a semiotics-inspired framework for modeling system provenance behavior. It introduces semiotics into operating-system-level event semantic extraction for the first time, enabling a generalizable semantic representation of events. Crucially, it reformulates anomaly detection as a cross-event semantic similarity computation task, thereby supporting zero-shot and few-shot identification of previously unseen attacks. The method integrates system provenance analysis, behavioral log embedding, and semantic similarity measurement. Trained with only 3–5 samples per attack class, it achieves over 92% accuracy in detecting unknown attacks on real-world cloud infrastructure, substantially outperforming conventional supervised learning approaches.
📝 Abstract
In recent years, the adoption of cloud services has been expanding at an unprecedented rate. As more and more organizations migrate or deploy their businesses to the cloud, related cybersecurity incidents such as data breaches are on the rise. Many inherent attributes of cloud environments, for example data sharing, remote access, dynamicity, and scalability, pose significant challenges for cloud security. Even worse, cyber threats are becoming increasingly sophisticated and covert. Attack methods such as Advanced Persistent Threats (APTs) are continually developed to bypass traditional security measures. Among the emerging technologies for robust threat detection, system provenance analysis is regarded as a promising mechanism and has attracted widespread attention in the field of incident response. This paper proposes a new few-shot learning-based attack detection approach with improved data context intelligence. We collect operating system behavior data from cloud systems during realistic attacks and leverage an innovative semiotics extraction method to describe system events. Inspired by advances in semantic analysis, a fruitful area of computational linguistics focused on understanding natural language, we further convert the anomaly detection problem into a similarity comparison problem. Comprehensive experiments show that the proposed approach generalizes to unseen attacks and makes accurate predictions even when the incident detection models are trained with very limited samples.
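To make the idea of treating detection as a similarity comparison more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that system events have already been rendered as short textual descriptions, uses a toy bag-of-tokens embedding as a stand-in for a learned semantic encoder, and labels a new event by its most similar example in a small labeled support set. The names `embed`, `cosine`, `classify`, and `SUPPORT`, as well as the event strings, are hypothetical.

```python
import math
from collections import Counter


def embed(event: str) -> Counter:
    """Toy embedding: token counts (a stand-in for a learned semantic encoder)."""
    return Counter(event.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(event: str, support: dict[str, list[str]]) -> tuple[str, float]:
    """Return the label of the most similar support example and that similarity."""
    query = embed(event)
    scores = {
        label: max(cosine(query, embed(sample)) for sample in samples)
        for label, samples in support.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical few-shot support set: three event descriptions per class.
SUPPORT = {
    "benign": [
        "process sshd read file /etc/passwd",
        "process nginx write file /var/log/nginx/access.log",
        "process cron execute file /usr/bin/logrotate",
    ],
    "attack": [
        "process bash connect socket remote_ip then write file /tmp/payload",
        "process curl write file /tmp/payload then execute file /tmp/payload",
        "process python read file /etc/shadow then connect socket remote_ip",
    ],
}

label, score = classify(
    "process wget write file /tmp/dropper then execute file /tmp/dropper",
    SUPPORT,
)
print(label, round(score, 3))
```

In this sketch the query event shares no file name or tool name with any support example, yet it is still matched by the shape of its behavior (write a file, then execute it). A real system would replace the token-count embedding with a semantic representation extracted from provenance data, which is the part the paper contributes.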