DeLog: An Efficient Log Compression Framework with Pattern Signature Synthesis

📅 2026-01-21
🤖 AI Summary
This work proposes DeLog, a pattern-driven log compression framework, and challenges the assumption that effective compression depends on parsing accuracy. Parser-based approaches often perform poorly on complex production logs, and the authors' empirical study finds that higher parsing accuracy does not guarantee a higher compression ratio. Instead, compression ratio depends on partitioning tokens into low-entropy, highly compressible groups. To this end, the authors introduce a Pattern Signature Synthesis mechanism for efficient pattern-based grouping and encoding. On 16 public datasets and 10 production datasets, DeLog achieves state-of-the-art compression ratio and speed.

📝 Abstract
Parser-based log compression, which separates static templates from dynamic variables, is a promising approach to exploit the unique structure of log data. However, its performance on complex production logs is often unsatisfactory. This performance gap coincides with a known degradation in the accuracy of its core log parsing component on such data, motivating our investigation into a foundational yet unverified question: does higher parsing accuracy necessarily lead to a better compression ratio? To answer this, we conduct the first empirical study quantifying this relationship and find that higher parsing accuracy does not guarantee a better compression ratio. Instead, our findings reveal that compression ratio is dictated by achieving effective pattern-based grouping and encoding, i.e., the partitioning of tokens into low-entropy, highly compressible groups. Guided by this insight, we design DeLog, a novel log compressor that implements a Pattern Signature Synthesis mechanism to achieve efficient pattern-based grouping. On 16 public and 10 production datasets, DeLog achieves state-of-the-art compression ratio and speed.
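
The idea of partitioning tokens into low-entropy groups can be illustrated with a small sketch. This is not DeLog's Pattern Signature Synthesis mechanism, which the abstract does not specify. It assumes a much simpler stand-in, grouping tokens by their position in the line, and uses hypothetical log lines. All function names are illustrative. It compares the empirical Shannon entropy of one mixed token stream with the token-weighted average entropy after grouping.

```python
import math
from collections import Counter, defaultdict

# Hypothetical log lines, invented for illustration only.
LOGS = [
    "conn 10.0.0.1 port 443 open",
    "conn 10.0.0.2 port 443 open",
    "conn 10.0.0.1 port 80 closed",
    "conn 10.0.0.3 port 443 open",
]


def shannon_entropy(tokens):
    """Empirical Shannon entropy of a token list, in bits per token."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def grouped_entropy(groups):
    """Token-weighted average entropy across a list of token groups."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * shannon_entropy(g) for g in groups)


def group_by_position(lines):
    """Assumed stand-in grouping: one group per whitespace token position."""
    groups = defaultdict(list)
    for line in lines:
        for i, tok in enumerate(line.split()):
            groups[i].append(tok)
    return list(groups.values())


all_tokens = [tok for line in LOGS for tok in line.split()]
mixed = shannon_entropy(all_tokens)
grouped = grouped_entropy(group_by_position(LOGS))

print(f"mixed stream:   {mixed:.3f} bits/token")
print(f"grouped stream: {grouped:.3f} bits/token")
```

The grouped figure can never exceed the mixed one, since conditioning on the group cannot increase empirical entropy. Entropy here is only a rough proxy for compressed size: it ignores the cost of storing the grouping itself and any cross-token redundancy that a real compressor would exploit.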

Problem

Research questions and friction points this paper is trying to address.

log compression
parsing accuracy
compression ratio
pattern-based grouping
log data

Innovation

Methods, ideas, or system contributions that make the work stand out.

log compression
pattern signature synthesis
parsing accuracy
entropy-based grouping
template-variable separation