Evolution of Log-Based Detection Rules in Public Repositories

📅 2026-05-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of empirical work on how log-based detection rules in public repositories evolve over time. It introduces a predicate graph intermediate representation that canonicalizes rule logic, and combines tree alignment, large language model inference, and manual validation to track 6,859 rule histories from Sigma and Splunk Security Content (SSC). Roughly 56% of rules underwent at least one revision to their detection logic, and over half of rules both added and removed clauses over their lifetimes. Roughly a quarter to a third of rules alternated between expanding detection coverage and reducing false positives, and reversions of earlier changes recur. Rule evolution is therefore predominantly non-monotonic, reflecting ongoing operational trade-offs rather than steady convergence toward a stable form.

📝 Abstract

Log-based detection rules remain central to modern security operations, encoding domain expertise that analysts iteratively refine to balance detection coverage against alert volume. Yet while prior work has examined the evolution of network intrusion detection signatures, the longitudinal behavior of log-based detection rules has received little empirical study. We present the first longitudinal analysis of detection rule evolution across two widely used repositories: the community-driven Sigma project and the curated Splunk Security Content (SSC). To compare rule versions based on detection logic rather than surface syntax, we introduce a predicate graph intermediate representation that canonicalizes the logical structure of a rule, together with a tree alignment procedure for analyzing changes across revisions. We apply this method to 6,859 rule histories from Sigma and SSC and find that roughly 56% of rules undergo at least one revision to their detection logic. Across rule lifetimes, evolution is predominantly non-monotonic, with over half of rules both adding and removing clauses over time. We further observe recurring reversions, indicating that changes are often revisited rather than strictly accumulated. Combining structural analysis with LLM-based inference and human validation of operational intent shows that roughly a quarter to a third of rules alternate between expanding coverage and reducing false positives, rather than converging toward a stable form. Together, these results reveal that detection rule evolution in public repositories reflects ongoing operational trade-offs rather than steady convergence. Our study raises questions about why rules change the way they do and supports research towards better processes for devising and deploying security rules.
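To make the idea of comparing detection logic rather than surface syntax concrete, here is a minimal Python sketch. It is not the paper's implementation: the tuple encoding of rules, the function names (`canonicalize`, `clauses`, `diff`, `summarize_history`), and the example predicates are all hypothetical, and a plain set difference over top-level clauses stands in for the tree alignment procedure the abstract describes.

```python
# Illustrative sketch only; the rule encoding and all names are assumptions.
# A node is one of:
#   ("pred", field, op, value)      a single predicate
#   ("and" | "or", [children])      a boolean operator over child nodes
#   ("not", child)                  a negation


def canonicalize(node):
    """Return a hashable canonical form: nested same-operator nodes are
    flattened, duplicate children dropped, and children sorted."""
    kind = node[0]
    if kind == "pred":
        return node
    if kind == "not":
        return ("not", canonicalize(node[1]))
    children = []
    for child in node[1]:
        c = canonicalize(child)
        if c[0] == kind:
            children.extend(c[1])  # and(and(a, b), c) becomes and(a, b, c)
        else:
            children.append(c)
    unique = sorted(set(children), key=repr)
    if len(unique) == 1:
        return unique[0]  # an operator with a single child collapses
    return (kind, tuple(unique))


def clauses(node):
    """Top-level clauses of the canonical form (a simplification)."""
    canon = canonicalize(node)
    if canon[0] in ("and", "or"):
        return set(canon[1])
    return {canon}


def diff(old, new):
    """Return (added, removed) top-level clauses between two versions."""
    before, after = clauses(old), clauses(new)
    return after - before, before - after


def summarize_history(versions):
    """Summarize one rule history given as a list of versions, oldest first."""
    canon = [canonicalize(v) for v in versions]
    changed = added_any = removed_any = False
    reversions = 0
    seen = canon[:1]
    for prev, cur in zip(canon, canon[1:]):
        if cur == prev:
            continue  # surface-only edit: the logic is unchanged
        changed = True
        added, removed = diff(prev, cur)
        added_any = added_any or bool(added)
        removed_any = removed_any or bool(removed)
        if cur in seen:
            reversions += 1  # the logic returned to an earlier form
        seen.append(cur)
    return {
        "logic_changed": changed,
        "non_monotonic": added_any and removed_any,
        "reversions": reversions,
    }


# Hypothetical predicates, loosely styled after process-creation rules.
IMG = ("pred", "Image", "endswith", "powershell.exe")
CMD = ("pred", "CommandLine", "contains", "-enc")
USER = ("pred", "User", "equals", "SYSTEM")

v1 = ("and", [IMG, CMD])
v2 = ("and", [CMD, ("and", [IMG])])     # reordered and re-nested, same logic
v3 = ("and", [IMG, CMD, ("not", USER)])  # adds a false-positive filter
v4 = ("and", [IMG, CMD])                 # filter removed again

summary = summarize_history([v1, v2, v3, v4])
print(summary)
```

In this toy history, `v2` differs from `v1` only in syntax, so it does not count as a logic revision, while the `v3` to `v4` step both removes a clause and restores an earlier form. Comparing only top-level clauses misses changes nested deeper in the tree, which is the kind of gap a tree alignment over the full structure would be expected to close.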

Problem

Research questions and friction points this paper is trying to address.

log-based detection rules

rule evolution

security operations

false positives

detection coverage

Innovation

Methods, ideas, or system contributions that make the work stand out.

predicate graph

tree alignment

detection rule evolution