CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Security Operations Centers (SOCs) suffer from analyst fatigue and missed detections due to overwhelming alert volumes; existing large language model (LLM)-based approaches rely on monolithic end-to-end processing, failing to address enterprise log noise, contextual sparsity, and unverifiable decision-making. This paper proposes a multi-agent LLM architecture that decouples analysis into three specialized agents—behavioral analysis, evidence collection, and reasoning adjudication—collaboratively constructing auditable, chain-of-evidence reasoning for fine-grained high-risk alert classification. Key innovations include log sequence modeling, cross-system evidence retrieval, and structured logical inference. We also release the first fine-grained, real-world dataset tailored for SOC investigation tasks. Experiments demonstrate substantial false positive reduction, outperforming single-agent LLM baselines across diverse enterprise scenarios, while ensuring high accuracy, strong interpretability, and practical deployability.

📝 Abstract
Security Operations Centers (SOCs) are overwhelmed by tens of thousands of daily alerts, with only a small fraction corresponding to genuine attacks. This overload creates alert fatigue, leading to overlooked threats and analyst burnout. Classical detection pipelines are brittle and context-poor, while recent LLM-based approaches typically rely on a single model to interpret logs, retrieve context, and adjudicate alerts end-to-end -- an approach that struggles with noisy enterprise data and offers limited transparency. We propose CORTEX, a multi-agent LLM architecture for high-stakes alert triage in which specialized agents collaborate over real evidence: a behavior-analysis agent inspects activity sequences, evidence-gathering agents query external systems, and a reasoning agent synthesizes findings into an auditable decision. To support training and evaluation, we release a dataset of fine-grained SOC investigations from production environments, capturing step-by-step analyst actions and linked tool outputs. Across diverse enterprise scenarios, CORTEX substantially reduces false positives and improves investigation quality over state-of-the-art single-agent LLMs.
Problem

Research questions and friction points this paper is trying to address.

Overwhelming volume of security alerts causing fatigue and missed threats
Brittle classical detection systems and opaque single-agent LLM approaches
Need for transparent, collaborative AI systems handling noisy enterprise data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM architecture for alert triage
Specialized agents collaborate over real evidence
Synthesizes findings into auditable decision process
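The collaboration pattern described above can be sketched as a minimal pipeline: a behavior-analysis step inspects the alert's activity sequence, an evidence-gathering step queries an external system for context, and a reasoning step combines the findings into an auditable verdict. This is a hypothetical illustration, not the paper's implementation; the agent names, the `Alert`/`Finding` structures, the keyword heuristic, and the stub asset inventory are all assumptions standing in for LLM calls and real enterprise data sources.

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    alert_id: str
    events: list       # raw activity sequence tied to the alert
    source_host: str

@dataclass
class Finding:
    agent: str
    summary: str
    evidence: list = field(default_factory=list)

def behavior_agent(alert: Alert) -> Finding:
    # Inspect the activity sequence for suspicious patterns.
    # (A real behavior-analysis agent would prompt an LLM over the log sequence.)
    suspicious = [e for e in alert.events if "powershell" in e.lower()]
    return Finding("behavior", f"{len(suspicious)} suspicious event(s)", suspicious)

def evidence_agent(alert: Alert, asset_db: dict) -> Finding:
    # Query an external system (here, a stub asset inventory) for context.
    owner = asset_db.get(alert.source_host, "unknown")
    return Finding("evidence", f"host owner: {owner}", [alert.source_host])

def reasoning_agent(findings: list) -> dict:
    # Adjudicate: escalate only when behavioral evidence supports it, and
    # emit the chain of evidence so the decision is auditable.
    chain = [f"{f.agent}: {f.summary}" for f in findings]
    risky = any(
        "suspicious" in f.summary and not f.summary.startswith("0")
        for f in findings
    )
    return {"verdict": "escalate" if risky else "dismiss",
            "chain_of_evidence": chain}

def triage(alert: Alert, asset_db: dict) -> dict:
    findings = [behavior_agent(alert), evidence_agent(alert, asset_db)]
    return reasoning_agent(findings)
```

The key design point the sketch mirrors is that the final decision carries the per-agent findings with it, rather than emerging from a single opaque end-to-end model call.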
👥 Authors
Bowen Wei
George Mason University
Yuan Shen Tay
Fluency Security
Howard Liu
Fluency Security
Jinhao Pan
Ph.D. Student in Computer Science, George Mason University
LLM · Responsible AI · Recommender System
Kun Luo
Zhejiang University
Ziwei Zhu
George Mason University
Chris Jordan
Fluency Security