RMSL: Weakly-Supervised Insider Threat Detection with Robust Multi-sphere Learning

📅 2025-08-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
In insider threat detection, the absence of behavior-level fine-grained annotations leads to high false positive and false negative rates in unsupervised methods. To address this, we propose a robust multi-hypersphere learning framework that leverages sequence-level weak labels. Our method integrates multi-instance learning with an adaptive debiased self-training mechanism: (1) initializing normal behavior modeling via a one-class classifier; (2) constructing a multi-hypersphere structure to capture heterogeneous normal patterns; and (3) dynamically selecting high-confidence behavioral samples based on prediction confidence to iteratively refine behavior-level pseudo-labels and correct labeling bias. To our knowledge, this is the first approach capable of effectively disentangling and distinguishing normal from anomalous behavior patterns without behavior-level ground-truth annotations. Experiments on multiple benchmark datasets demonstrate significant improvements in behavior-level detection accuracy, reducing false positive and false negative rates by 18.7% and 23.4%, respectively.
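
Step (3) above can be illustrated with a minimal sketch of confidence-based pseudo-label selection. This summary does not give the paper's actual selection rule, so the function name `select_confident` and the fixed thresholds `low` and `high` are hypothetical stand-ins for the adaptive mechanism it describes.

```python
def select_confident(scores, low=0.2, high=0.8):
    """Assign pseudo-labels only to behaviors whose scores are confident.

    scores: anomaly probabilities in [0, 1], one per behavior in a sequence.
    low, high: hypothetical fixed thresholds (assumption, not from the paper).
    Returns (index, pseudo_label) pairs, where 1 = anomalous and 0 = normal.
    """
    selected = []
    for i, s in enumerate(scores):
        if s >= high:
            selected.append((i, 1))   # confidently anomalous
        elif s <= low:
            selected.append((i, 0))   # confidently normal
        # scores strictly between the thresholds stay unlabeled this round
    return selected


pseudo = select_confident([0.05, 0.5, 0.93, 0.81, 0.3])
```

In a self-training loop, only the selected behaviors would feed the next training round, while the ambiguous ones are revisited once the model has been updated.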

📝 Abstract
Insider threat detection aims to identify malicious user behavior by analyzing logs that record user interactions. Due to the lack of fine-grained behavior-level annotations, detecting specific behavior-level anomalies within user behavior sequences is challenging. Unsupervised methods face high false positive rates and miss rates due to the inherent ambiguity between normal and anomalous behaviors. In this work, we instead introduce weak labels of behavior sequences, which have lower annotation costs, i.e., the training labels (anomalous or normal) are at sequence-level instead of behavior-level, to enhance the detection capability for behavior-level anomalies by learning discriminative features. To achieve this, we propose a novel framework called Robust Multi-sphere Learning (RMSL). RMSL uses multiple hyper-spheres to represent the normal patterns of behaviors. Initially, a one-class classifier is constructed as a good anomaly-supervision-free starting point. Building on this, using multiple instance learning and adaptive behavior-level self-training debiasing based on model prediction confidence, the framework further refines hyper-spheres and feature representations using weak sequence-level labels. This approach enhances the model's ability to distinguish between normal and anomalous behaviors. Extensive experiments demonstrate that RMSL significantly improves the performance of behavior-level insider threat detection.
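
To make the idea of representing normal behavior with multiple hyper-spheres concrete, the sketch below scores a feature vector against several spheres. It assumes each sphere is given by a center and a radius, and that the score is the smallest signed margin (distance to a center minus that sphere's radius) across all spheres. The abstract does not specify the scoring function, so `multi_sphere_score` and its arguments are illustrative assumptions.

```python
import math


def multi_sphere_score(x, centers, radii):
    """Anomaly score of feature vector x against several hyperspheres.

    For each sphere the signed margin is (distance to center - radius).
    The score is the smallest margin: negative or zero means x lies
    inside at least one sphere, positive means it lies outside all of them.
    """
    return min(math.dist(x, c) - r for c, r in zip(centers, radii))


# Two hypothetical "normal" patterns in a 2-D feature space.
centers = [(0.0, 0.0), (10.0, 0.0)]
radii = [1.0, 2.0]

inside = multi_sphere_score((0.5, 0.0), centers, radii)    # inside the first sphere
outside = multi_sphere_score((5.0, 0.0), centers, radii)   # between the two spheres
```

Using several spheres rather than one lets distinct normal patterns each get a tight boundary, instead of forcing one large sphere to cover them all.
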
Problem

Research questions and friction points this paper is trying to address.

Detect insider threats using weakly-labeled behavior sequences
Reduce false positives in unsupervised anomaly detection
Improve behavior-level anomaly discrimination with multi-sphere learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multiple hyper-spheres for normal behavior patterns
Combines multi-instance learning with self-training debiasing
Leverages weak sequence-level labels for anomaly detection
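
One common way to train on sequence-level labels with multi-instance learning is to pool behavior scores into a sequence score and apply a ranking loss. The sketch below shows that generic pattern (top-k pooling plus a hinge loss). It is not taken from the paper, and `sequence_score`, `mil_hinge_loss`, `k`, and `margin` are hypothetical names and parameters.

```python
def sequence_score(behavior_scores, k=1):
    """Pool behavior-level scores into one sequence-level score.

    Averages the k highest scores; k=1 is plain max pooling.
    Assumes behavior_scores is non-empty.
    """
    top = sorted(behavior_scores, reverse=True)[:k]
    return sum(top) / len(top)


def mil_hinge_loss(anomalous_seq, normal_seq, margin=1.0, k=1):
    """Hinge ranking loss on a pair of weakly labeled sequences.

    The loss is zero once the anomalous sequence outscores the
    normal sequence by at least `margin`.
    """
    gap = sequence_score(anomalous_seq, k) - sequence_score(normal_seq, k)
    return max(0.0, margin - gap)


loss = mil_hinge_loss([0.1, 0.9, 0.4], [0.2, 0.3])
```

Because only the highest-scoring behaviors in each sequence drive the loss, the sequence-level label supervises the behaviors most likely to be anomalous without requiring behavior-level annotation.
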
Authors
Yang Wang, Nankai University
Yaxin Zhao, Nankai University
Xinyu Jiao, Nankai University
Sihan Xu, Ph.D. Student, University of Michigan (AI)
Xiangrui Cai, Nankai University (Healthcare AI, Time Series Analysis, AI Safety)
Ying Zhang, Nankai University
Xiaojie Yuan, Nankai University