SCOPE: Tree-based Self-Correcting Online Log Parsing via Syntactic-Semantic Collaboration

📅 2026-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes an online self-correcting log parsing approach that bridges the gap between efficiency and accuracy in log analysis. While traditional methods are computationally efficient yet limited in accuracy, large language models (LLMs) offer high precision at substantial computational cost. To address this trade-off, the proposed method introduces a novel bidirectional tree structure to enhance template matching rates and employs a two-stage syntax–semantics collaboration mechanism: a lightweight NLP model driven by part-of-speech tagging performs initial syntactic filtering, followed by selective invocation of an LLM for semantic refinement only when necessary. This design drastically reduces LLM usage while maintaining high parsing accuracy. Experimental results across multiple benchmark datasets demonstrate that the framework consistently outperforms state-of-the-art methods in both accuracy and efficiency.
📝 Abstract
Log parsing is a critical step for automated log analysis in complex systems. Traditional heuristic-based methods offer high efficiency but are limited in accuracy due to overlooking semantic context. In contrast, recent LLM-based parsers improve accuracy via semantic understanding but incur high latency from frequent model calls. To address this, we propose SCOPE, the first self-correcting online log parsing method that integrates the strengths of both heuristic and LLM-based paradigms. SCOPE introduces a novel bi-directional tree structure that enables efficient template matching from both forward and reverse directions, resulting in a higher overall matching rate. Additionally, it adopts a two-stage syntactic-semantic collaboration framework: a lightweight NLP model first utilizes part-of-speech (POS) information for syntax-based matching, while the LLM is selectively invoked as a fallback to handle semantically complex cases when uncertainty remains. This design significantly reduces LLM API usage while maintaining high accuracy, achieving a balance between efficiency and effectiveness. Extensive evaluations on diverse benchmark datasets show that SCOPE outperforms state-of-the-art methods in both accuracy and efficiency. The implementation and datasets are publicly released to facilitate further research.
Problem

Research questions and friction points this paper is trying to address.

log parsing
heuristic-based methods
LLM-based parsers
accuracy-efficiency trade-off
semantic context
Innovation

Methods, ideas, or system contributions that make the work stand out.

log parsing
self-correcting
syntactic-semantic collaboration
bidirectional tree
online parsing
Dongyi Fan
Zhejiang Sci-Tech University
Suqiong Zhang
Zhejiang Sci-Tech University
Lili He
CCHMC: Cincinnati Children's Hospital Medical Center
Deep Learning, Medical Image Analysis
Ming Liu
Zhejiang Sci-Tech University
Yifan Huo
Zhejiang Sci-Tech University