SCOPE: Tree-based Self-Correcting Online Log Parsing via Syntactic-Semantic Collaboration

📅 2026-03-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes an online self-correcting log parsing approach that bridges the gap between efficiency and accuracy in log analysis. While traditional methods are computationally efficient yet limited in accuracy, large language models (LLMs) offer high precision at substantial computational cost. To address this trade-off, the proposed method introduces a novel bidirectional tree structure to enhance template matching rates and employs a two-stage syntax–semantics collaboration mechanism: a lightweight NLP model driven by part-of-speech tagging performs initial syntactic filtering, followed by selective invocation of an LLM for semantic refinement only when necessary. This design drastically reduces LLM usage while maintaining high parsing accuracy. Experimental results across multiple benchmark datasets demonstrate that the framework consistently outperforms state-of-the-art methods in both accuracy and efficiency.
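The two-stage collaboration described above can be illustrated with a minimal sketch. It is not the authors' implementation: `syntactic_stage`, `parse`, and the `stats` counter are hypothetical names, and a simple digit/alphabetic heuristic stands in for the POS-driven NLP model. The LLM is passed in as a plain callable so that no real API is assumed.

```python
import re
from typing import Callable, Dict, List, Tuple

HAS_DIGIT = re.compile(r"\d")  # assumption: digit-bearing tokens are variables


def syntactic_stage(tokens: List[str]) -> Tuple[List[str], bool]:
    """Cheap first pass (stand-in for a POS-based model, not the paper's).

    Tokens containing digits are masked as variables. Purely alphabetic
    tokens are kept as constants. Anything else (paths, key=value pairs)
    is kept but marks the result as uncertain.
    """
    template, confident = [], True
    for tok in tokens:
        if HAS_DIGIT.search(tok):
            template.append("<*>")
        elif tok.isalpha():
            template.append(tok)
        else:
            template.append(tok)
            confident = False
    return template, confident


def parse(line: str, llm: Callable[[str], str], stats: Dict[str, int]) -> str:
    """Return a template, invoking the LLM only when stage one is uncertain."""
    template, confident = syntactic_stage(line.split())
    if confident:
        return " ".join(template)
    stats["llm_calls"] += 1  # track how often the expensive path is taken
    return llm(line)


# Usage with a stub in place of a real LLM client (hypothetical).
stats = {"llm_calls": 0}
stub_llm = lambda line: "Failed to open <*>"
easy = parse("Connection closed after 30 seconds", stub_llm, stats)
hard = parse("Failed to open /var/log/app.log", stub_llm, stats)
```

In this sketch only the second line reaches the stub, which is the intended effect of the design: the expensive semantic stage runs only for lines the cheap syntactic stage cannot settle.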

📝 Abstract

Log parsing is a critical step for automated log analysis in complex systems. Traditional heuristic-based methods offer high efficiency but are limited in accuracy due to overlooking semantic context. In contrast, recent LLM-based parsers improve accuracy via semantic understanding but incur high latency from frequent model calls. To address this, we propose SCOPE, the first self-correcting online log parsing method that integrates the strengths of both heuristic and LLM-based paradigms. SCOPE introduces a novel bi-directional tree structure that enables efficient template matching from both forward and reverse directions, resulting in a higher overall matching rate. Additionally, it adopts a two-stage syntactic-semantic collaboration framework: a lightweight NLP model first utilizes part-of-speech (POS) information for syntax-based matching, while the LLM is selectively invoked as a fallback to handle semantically complex cases when uncertainty remains. This design significantly reduces LLM API usage while maintaining high accuracy, achieving a balance between efficiency and effectiveness. Extensive evaluations on diverse benchmark datasets show that SCOPE outperforms state-of-the-art methods in both accuracy and efficiency. The implementation and datasets are publicly released to facilitate further research.
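To make the idea of matching from both directions concrete, here is a small sketch under stated assumptions. The abstract does not specify the tree layout, so this uses a simplified fixed-depth index (keyed by token count plus the first or last `depth` tokens); `BiTree`, `add`, and `match` are hypothetical names and this is not SCOPE's actual data structure. It shows why a reverse index can raise the matching rate: a template whose leading tokens are variables cannot be found by its prefix but can be found by its suffix.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

WILDCARD = "<*>"


class BiTree:
    """Simplified two-direction template index (illustrative assumption)."""

    def __init__(self, depth: int = 2):
        self.depth = depth
        self.forward = defaultdict(list)   # (length, first tokens) -> templates
        self.backward = defaultdict(list)  # (length, last tokens) -> templates

    def _keys(self, tokens: List[str]):
        n = len(tokens)
        return (n, tuple(tokens[: self.depth])), (n, tuple(tokens[-self.depth:]))

    def add(self, template: str) -> None:
        tokens = template.split()
        fwd, bwd = self._keys(tokens)
        # A key that contains a wildcard can never equal a concrete log key,
        # so the template is indexed only in directions with constant keys.
        if WILDCARD not in fwd[1]:
            self.forward[fwd].append(tokens)
        if WILDCARD not in bwd[1]:
            self.backward[bwd].append(tokens)

    @staticmethod
    def _fits(template: List[str], tokens: List[str]) -> bool:
        return all(t == WILDCARD or t == x for t, x in zip(template, tokens))

    def match(self, line: str) -> Optional[Tuple[str, str]]:
        """Try the forward index first, then fall back to the backward one."""
        tokens = line.split()
        fwd, bwd = self._keys(tokens)
        for name, index, key in (("forward", self.forward, fwd),
                                 ("backward", self.backward, bwd)):
            for template in index.get(key, []):
                if self._fits(template, tokens):
                    return " ".join(template), name
        return None


tree = BiTree()
tree.add("Connection closed after <*> seconds")
tree.add("<*> logged in successfully")
hit_fwd = tree.match("Connection closed after 30 seconds")
hit_bwd = tree.match("alice logged in successfully")
```

Here the first line is resolved through the forward index, while the second starts with a variable token and is resolved only through the backward index. A line that misses both indexes returns `None`, which is the point at which a parser of this kind would hand off to its later stages.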

Problem

Research questions and friction points this paper is trying to address.

log parsing

heuristic-based methods

LLM-based parsers

accuracy-efficiency trade-off

semantic context

Innovation

Methods, ideas, or system contributions that make the work stand out.

log parsing

self-correcting

syntactic-semantic collaboration