Small is Beautiful: A Practical and Efficient Log Parsing Framework

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the significant performance limitations of small-scale large language models (LLMs) in log parsing, a setting where deployments must balance data privacy against computational efficiency. To overcome these challenges, we propose EFParser, an unsupervised log parsing framework tailored for small LLMs. EFParser introduces an adaptive dual-buffer mechanism to distinguish known from emerging log patterns, and incorporates a template correction module that prevents erroneous templates from being injected into the cache, thereby enhancing parsing accuracy and robustness. Experimental results demonstrate that EFParser outperforms current state-of-the-art methods by an average of 12.5% across multiple public large-scale datasets, even surpassing several baselines that rely on larger models, while maintaining high inference efficiency.

📝 Abstract
Log parsing is a fundamental step in log analysis, partitioning raw logs into constant templates and dynamic variables. While recent semantic-based parsers leveraging Large Language Models (LLMs) exhibit superior generalizability over traditional syntax-based methods, their effectiveness is heavily contingent on model scale. This dependency leads to significant performance collapse when employing smaller, more resource-efficient LLMs. Such degradation creates a major barrier to real-world adoption, where data privacy requirements and computational constraints necessitate the use of succinct models. To bridge this gap, we propose EFParser, an unsupervised LLM-based log parser designed to enhance the capabilities of smaller models through systematic architectural innovation. EFParser introduces a dual-cache system with an adaptive updating mechanism that distinguishes between novel patterns and variations of existing templates. This allows the parser to merge redundant templates and rectify prior errors, maintaining cache consistency. Furthermore, a dedicated correction module acts as a gatekeeper, validating and refining every LLM-generated template before caching to prevent error injection. Empirical evaluations on public large-scale datasets demonstrate that EFParser outperforms state-of-the-art baselines by an average of 12.5% across all metrics when running on smaller LLMs, even surpassing some baselines utilizing large-scale models. Despite its additional validation steps, EFParser maintains high computational efficiency, offering a robust and practical solution for real-world log analysis deployment.
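The abstract describes the dual-cache design only at a high level. As a rough illustration of the idea, the toy sketch below keeps a stable cache of confirmed templates and a candidate cache of emerging patterns, promoting a candidate after repeated hits and running a correction pass before anything is cached. All names here (`DualCacheParser`, `promote_after`, the similarity threshold) are hypothetical, and a simple regex heuristic stands in for the paper's LLM-driven template generation and correction; this is not EFParser's actual algorithm.

```python
import re
from collections import defaultdict

PLACEHOLDER = "<*>"

def tokens_match(tpl, toks):
    """A log matches a template when lengths agree and every
    non-placeholder token is identical."""
    return len(tpl) == len(toks) and all(
        t == PLACEHOLDER or t == x for t, x in zip(tpl, toks))

def merge(tpl, toks):
    """Generalize a template: positions that differ become placeholders."""
    return [t if t == x else PLACEHOLDER for t, x in zip(tpl, toks)]

DYNAMIC = re.compile(r"^(\d+|0x[0-9a-fA-F]+)$")

def correct(toks):
    """Toy stand-in for the correction module: force obviously dynamic
    tokens (pure numbers, hex ids) to placeholders before caching."""
    return [PLACEHOLDER if DYNAMIC.match(t) else t for t in toks]

class DualCacheParser:
    """Hypothetical dual-cache parser: stable cache for known patterns,
    candidate cache for emerging ones, promotion on repeated hits."""

    def __init__(self, promote_after=2):
        self.stable = []                    # confirmed templates (token lists)
        self.candidates = defaultdict(int)  # emerging template -> hit count
        self.promote_after = promote_after

    def parse(self, line):
        toks = correct(line.split())
        # 1) known pattern: stable cache hit
        for tpl in self.stable:
            if tokens_match(tpl, toks):
                return " ".join(tpl)
        # 2) emerging pattern: merge with a similar-enough candidate
        for key, hits in list(self.candidates.items()):
            if len(key) == len(toks):
                same = sum(a == b for a, b in zip(key, toks))
                if same / len(key) >= 0.5:
                    merged = tuple(merge(list(key), toks))
                    del self.candidates[key]
                    if hits + 1 >= self.promote_after:
                        self.stable.append(list(merged))  # promote
                    else:
                        self.candidates[merged] = hits + 1
                    return " ".join(merged)
        # 3) genuinely new pattern: start a candidate entry
        self.candidates[tuple(toks)] += 1
        return " ".join(toks)
```

For example, feeding `"connect from 10.0.0.1 port 22"` and then `"connect from 10.0.0.2 port 80"` would merge the two into the template `connect from <*> port <*>` and promote it to the stable cache, so later variants hit the stable cache directly. The real system additionally validates every LLM-generated template before caching, which this sketch only gestures at with the `correct` heuristic.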
Problem

Research questions and friction points this paper is trying to address.

log parsing
large language models
small models
computational efficiency
data privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

log parsing
small LLMs
dual-cache system
template correction
unsupervised parsing
Minxing Wang
Singapore Management University, Singapore
Yintong Huo
Singapore Management University
AI4SE · AIOps · Log analysis · MLLM for SE