Using Large Language Models for Template Detection from Security Event Logs

📅 2024-09-08

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This work addresses the underexplored problem of unsupervised log template mining from security incident logs, applying large language models (LLMs) in a zero-shot, fully unsupervised setting without labeled data or manual rules. We propose a lightweight fine-tuning framework that integrates semantic clustering, dynamic template abstraction, and log-structure priors to guide template extraction. The method avoids reliance on handcrafted heuristics or supervised signals while preserving interpretability and efficiency. Evaluated across multiple real-world security log datasets, it achieves 92.1% template accuracy, outperforming state-of-the-art unsupervised baselines by an average of 11.3%. It also improves downstream tasks, including alert compression and anomaly detection. By moving beyond the limitations of conventional clustering- and regex-based approaches, this work establishes a reproducible, generalizable, LLM-driven unsupervised paradigm for log understanding.

📝 Abstract

In modern IT systems and computer networks, real-time and offline event log analysis is a crucial part of cyber security monitoring. In particular, event log analysis techniques are essential for the timely detection of cyber attacks and for assisting security experts with the analysis of past security incidents. The detection of line patterns or templates from unstructured textual event logs has been identified as an important task of event log analysis since detected templates represent event types in the event log and prepare the logs for downstream online or offline security monitoring tasks. During the last two decades, a number of template mining algorithms have been proposed. However, many proposed algorithms rely on traditional data mining techniques, and the usage of Large Language Models (LLMs) has received less attention so far. Also, most approaches that harness LLMs are supervised, and unsupervised LLM-based template mining remains an understudied area. The current paper addresses this research gap and investigates the application of LLMs for unsupervised detection of templates from unstructured security event logs.
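To make the notion of a template concrete, here is a minimal Python sketch. It is not the method studied in the paper: it assumes the lines have already been grouped by event type and have the same number of whitespace-separated tokens, and it simply keeps tokens shared by all lines while replacing the rest with a wildcard. The function name `derive_template`, the `<*>` wildcard, and the sample SSH log lines are illustrative assumptions.

```python
from typing import List

WILDCARD = "<*>"  # assumed placeholder for variable parts of a line


def derive_template(lines: List[str]) -> str:
    """Collapse log lines of equal token count into one template.

    Tokens identical in every line are kept; positions where the
    lines disagree are replaced by a wildcard.
    """
    token_rows = [line.split() for line in lines]
    if not token_rows:
        raise ValueError("need at least one log line")
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("all lines must have the same number of tokens")

    template = []
    for column in zip(*token_rows):
        # A column holds the tokens found at one position across all lines.
        template.append(column[0] if len(set(column)) == 1 else WILDCARD)
    return " ".join(template)


# Hypothetical sample lines, made up for illustration.
sample_lines = [
    "Failed password for root from 10.0.0.5 port 52210 ssh2",
    "Failed password for admin from 192.168.1.9 port 40022 ssh2",
    "Failed password for root from 172.16.3.7 port 61544 ssh2",
]

template = derive_template(sample_lines)
print(template)  # Failed password for <*> from <*> port <*> ssh2
```

The resulting template stands for one event type (a failed SSH login), and the wildcard positions mark the fields that a downstream monitoring task would extract or aggregate.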

Problem

Research questions and friction points this paper is trying to address.

Detecting templates from unstructured security event logs

Applying LLMs for unsupervised template mining

Improving event log analysis for cybersecurity monitoring

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs for template detection

Unsupervised mining from event logs

Detecting patterns in security logs
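One way to picture unsupervised, LLM-based template detection is a prompt that contains only raw log lines and an instruction, with no labeled examples. The sketch below is an assumption for illustration and not the prompt or pipeline used in the paper: `build_template_prompt` and `parse_templates` are hypothetical helpers, the prompt wording is made up, and no model is called.

```python
from typing import List


def build_template_prompt(log_lines: List[str], wildcard: str = "<*>") -> str:
    """Build a prompt holding raw lines and an instruction, no labeled examples."""
    numbered = "\n".join(
        f"{i}. {line}" for i, line in enumerate(log_lines, start=1)
    )
    return (
        "The following lines come from a security event log.\n"
        "Group lines that describe the same event type and output one "
        "template per group.\n"
        f"Replace variable parts such as user names, IP addresses, and "
        f"ports with {wildcard}.\n"
        "Output only the templates, one per line.\n\n"
        f"{numbered}\n"
    )


def parse_templates(response_text: str) -> List[str]:
    """Keep non-empty response lines, de-duplicated in first-seen order."""
    seen = {}
    for line in response_text.splitlines():
        cleaned = line.strip()
        if cleaned:
            seen.setdefault(cleaned, None)
    return list(seen)


# Hypothetical input lines and a hand-written stand-in for a model reply.
prompt = build_template_prompt([
    "Failed password for root from 10.0.0.5 port 52210 ssh2",
    "Accepted publickey for alice from 10.0.0.8 port 40112 ssh2",
])
fake_reply = (
    "Failed password for <*> from <*> port <*> ssh2\n"
    "\n"
    "Accepted publickey for <*> from <*> port <*> ssh2\n"
)
templates = parse_templates(fake_reply)
```

In practice the prompt string would be sent to whichever model is available, and the reply would still need validation, since a model may return malformed or overly general templates.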
