🤖 AI Summary
Existing User and Entity Behavior Analytics (UEBA) frameworks struggle with the heterogeneity of numerical and textual behavioral logs, which leads to poor anomaly interpretability. Method: This paper proposes an interpretable threat detection method that combines a deep autoencoder with Doc2Vec: Doc2Vec embeds textual logs into dense vectors, which are concatenated with numerical features and fed into a deep autoencoder for joint reconstruction and anomaly scoring. Contribution/Results: The authors present it as the first implementation of an explainable UEBA anomaly detection framework that pairs deep autoencoders with Doc2Vec to handle both numerical and textual features, and they give a proof that two widely used definitions of fully connected neural networks are equivalent. Evaluated on real anomalies and on synthetic anomalies generated from real attack data, the method achieves high detection performance (F1 ≥ 0.92) and enables fine-grained interpretability by tracing reconstruction residuals to specific user/entity actions, generating actionable, human-readable alerts. The authors suggest that the framework can be integrated into enterprise environments, complementing existing security systems.
📝 Abstract
User and Entity Behaviour Analytics (UEBA) is a broad branch of data analytics that attempts to build a normal behavioural profile in order to detect anomalous events. Among the techniques used to detect anomalies, Deep Autoencoders constitute one of the most promising deep learning models for UEBA tasks, allowing explainable detection of security incidents that could lead to the leak of personal data, hijacking of systems, or access to sensitive business information. In this study, we introduce the first implementation of an explainable UEBA-based anomaly detection framework that leverages Deep Autoencoders in combination with Doc2Vec to process both numerical and textual features. Additionally, based on the theoretical foundations of neural networks, we offer a novel proof demonstrating the equivalence of two widely used definitions of fully-connected neural networks. The experimental results demonstrate the proposed framework's capability to effectively detect both real anomalies and synthetic anomalies generated from real attack data, showing that the models not only identify anomalies correctly but also provide explainable results that enable the reconstruction of the possible origin of the anomaly. Our findings suggest that the proposed UEBA framework can be seamlessly integrated into enterprise environments, complementing existing security systems for explainable threat detection.
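The explainability idea described above, tracing an anomaly back to the features the autoencoder reconstructs worst, can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-error residual, the mean-based anomaly score, the feature group names (`login_count`, `bytes_out`, `command_text`), and the example values are all assumptions made for illustration. The input vector stands for numerical features concatenated with a text embedding (which the paper obtains from Doc2Vec), and the reconstruction stands for the output of an already trained autoencoder.

```python
from typing import Dict, List, Sequence, Tuple


def feature_residuals(x: Sequence[float], x_hat: Sequence[float]) -> List[float]:
    """Squared reconstruction error for each input feature (assumed metric)."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return [(a - b) ** 2 for a, b in zip(x, x_hat)]


def anomaly_score(residuals: Sequence[float]) -> float:
    """Mean squared error over all features (assumed aggregation)."""
    return sum(residuals) / len(residuals)


def explain(
    residuals: Sequence[float],
    groups: Dict[str, Sequence[int]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Sum residuals per named feature group and return the top_k contributors."""
    totals = {name: sum(residuals[i] for i in idx) for name, idx in groups.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical event: two numerical features followed by a 3-dimensional
# text embedding (a stand-in for a Doc2Vec vector of a logged command).
x = [0.2, 0.9, 0.1, -0.3, 0.5]
# Hypothetical reconstruction produced by a trained autoencoder.
x_hat = [0.2, 0.4, 0.1, -0.2, 0.5]

# Hypothetical mapping from readable feature names to vector positions.
groups = {
    "login_count": [0],
    "bytes_out": [1],
    "command_text": [2, 3, 4],
}

residuals = feature_residuals(x, x_hat)
score = anomaly_score(residuals)
top = explain(residuals, groups)

print(f"anomaly score: {score:.3f}")
for name, contribution in top:
    print(f"{name}: {contribution:.3f}")
```

In this toy event the `bytes_out` feature dominates the residual, so an alert built from `top` would point an analyst at the outbound data volume first. Grouping the embedding dimensions under one name is a design choice of the sketch: individual embedding coordinates are not meaningful to a reader, while the log field they came from is.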