Lempel-Ziv Complexity, Empirical Entropies, and Chain Rules

📅 2025-06-15
🤖 AI Summary
This paper derives upper and lower bounds on the overall compression ratio achieved by the LZ78 algorithm when it is applied independently to the $k$-blocks of a finite individual sequence, and expresses both bounds in terms of the sequence's normalized empirical entropies. The bounds are tight and meaningful only when the order of the empirical entropy is matched to the block length: small relative to $k$ for the upper bound, large relative to $k$ for the lower bound. Key contributions include: (i) the two compression-ratio bounds together with this condition on the entropy order; (ii) a chain rule for LZ complexity, which decomposes the joint LZ complexity of two sequences $x$ and $y$ into the LZ complexity of $x$ plus the conditional LZ complexity of $y$ given $x$, up to small terms and at the price of changing the block length; and (iii) additional conclusions that follow from the bounds.

📝 Abstract
We derive upper and lower bounds on the overall compression ratio of the 1978 Lempel-Ziv (LZ78) algorithm, applied independently to $k$-blocks of a finite individual sequence. Both bounds are given in terms of normalized empirical entropies of the given sequence. For the bounds to be tight and meaningful, the order of the empirical entropy should be small relative to $k$ in the upper bound, but large relative to $k$ in the lower bound. Several non-trivial conclusions arise from these bounds. One of them is a certain form of a chain rule of the Lempel-Ziv (LZ) complexity, which decomposes the joint LZ complexity of two sequences, say, $x$ and $y$, into the sum of the LZ complexity of $x$ and the conditional LZ complexity of $y$ given $x$ (up to small terms). The price of this decomposition, however, is in changing the length of the block. Additional conclusions are discussed as well.
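
The two quantities the abstract compares can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's construction: the function names are hypothetical, the LZ78 code length uses the textbook cost of one dictionary index plus one raw symbol per phrase, and the normalized empirical entropy is taken over non-overlapping blocks of length `order` (the paper may define it differently, for example with sliding windows).

```python
import math
from collections import Counter


def lz78_phrase_count(block):
    """Count the phrases in the LZ78 incremental parsing of `block`."""
    dictionary = {""}
    phrase = ""
    count = 0
    for symbol in block:
        phrase += symbol
        if phrase not in dictionary:
            # Shortest string not seen before: it becomes a new phrase.
            dictionary.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        # Trailing incomplete phrase (a repeat of an earlier one).
        count += 1
    return count


def lz78_code_length(block, alphabet_size=2):
    """Approximate LZ78 code length in bits (assumed textbook cost model).

    The j-th phrase is charged ceil(log2 j) bits for the index of its
    prefix plus ceil(log2 alphabet_size) bits for its last symbol.
    """
    c = lz78_phrase_count(block)
    symbol_bits = math.ceil(math.log2(alphabet_size))
    return sum(math.ceil(math.log2(j)) + symbol_bits for j in range(1, c + 1))


def blockwise_lz78_rate(x, k, alphabet_size=2):
    """Bits per symbol when LZ78 restarts on every k-block of x."""
    blocks = [x[i:i + k] for i in range(0, len(x), k)]
    total_bits = sum(lz78_code_length(b, alphabet_size) for b in blocks)
    return total_bits / len(x)


def normalized_empirical_entropy(x, order):
    """Empirical entropy of non-overlapping `order`-blocks, divided by `order`."""
    blocks = [x[i:i + order] for i in range(0, len(x) - order + 1, order)]
    counts = Counter(blocks)
    total = len(blocks)
    entropy = -sum(c / total * math.log2(c / total) for c in counts.values())
    return entropy / order


if __name__ == "__main__":
    x = "01" * 512  # a highly compressible binary sequence
    for k in (16, 64, 256):
        print(k, blockwise_lz78_rate(x, k), normalized_empirical_entropy(x, 4))
```

Varying `k` against `order` in the loop gives a feel for why the abstract ties the usefulness of each bound to the ratio between the entropy order and the block length.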
Problem

Research questions and friction points this paper is trying to address.

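Bound the compression ratio of LZ78 applied independently to $k$-blocks of a finite individual sequence
Express both bounds through normalized empirical entropies, whose order must be small relative to $k$ for the upper bound and large relative to $k$ for the lower bound
Derive a chain rule that decomposes joint LZ complexity into marginal and conditional LZ complexity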
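
The chain rule named in the list above can be written schematically as follows. The notation is an assumption made for illustration and is not taken from the paper: $\rho_{k}(\cdot)$ denotes LZ complexity computed with block length $k$, and $\rho_{k}(\cdot \mid \cdot)$ its conditional counterpart. The block lengths $k_1$, $k_2$, $k_3$ need not coincide, which reflects the stated price of the decomposition; their exact relationship and the size of the small terms hidden in "$\approx$" are given in the paper.

$$
\rho_{k_1}(x, y) \;\approx\; \rho_{k_2}(x) + \rho_{k_3}(y \mid x)
$$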
Innovation

Methods, ideas, or system contributions that make the work stand out.

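Upper and lower bounds on the overall compression ratio of blockwise LZ78
Both bounds stated in terms of normalized empirical entropies of the sequence
A chain rule for LZ complexity, valid up to small terms, at the price of changing the block length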