🤖 AI Summary
This paper addresses the problem of maintaining the LZ77 factorization of a dynamic text under arbitrary character-level edits (insertions, deletions, and substitutions), keeping the size of the factorization up to date after every edit and supporting random access to individual factors. Prior work by Bannai, Charalampopoulos, and Radoszewski was restricted to the semi-dynamic setting, where the only allowed updates are insertions at the end of the string and deletions from its start; this work extends LZ77 size maintenance to the fully dynamic setting. The proposed data structure combines dynamic string indexing, block-based processing, and periodicity-aware structural analysis. It achieves Õ(n^(2/3)) time per update. This bound is larger than the Õ(√n) bound known for the semi-dynamic setting, but it holds for a strictly more general class of updates. The data structure also supports polylogarithmic-time random access to LZ77 factors, providing enhanced functionality for dynamic text processing.
📝 Abstract
The Lempel-Ziv 77 (LZ77) factorization is a fundamental compression scheme widely used in text processing and data compression. While efficient static algorithms exist for computing LZ77, maintaining it dynamically remains a challenging problem. Recently, Bannai, Charalampopoulos, and Radoszewski introduced an algorithm that maintains the size of the LZ77 factorization of a dynamic text in $\tilde{O}(\sqrt{n})$ time per update. Their data structure works in the semi-dynamic model, where the only allowed updates are insertions at the end of the string or deletions from the start. In contrast, we present an algorithm that operates in a significantly more general setting of arbitrary edit operations. Our algorithm maintains the size of the LZ77 factorization of a string undergoing symbol substitutions, deletions, and insertions in $\tilde{O}(n^{2/3})$ time per update. Additionally, our data structure supports random access to the LZ77 factorization in polylogarithmic time, providing enhanced functionality for dynamic text processing.
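To make concrete the quantity being maintained, the following is a minimal sketch of a naive LZ77 factorization that is recomputed from scratch after an edit. It is not the paper's data structure, only a brute-force baseline. It assumes the self-referential variant of LZ77, in which each factor is either a single fresh character or the longest prefix of the remaining text that also occurs starting at an earlier position (possibly overlapping the factor itself); the paper may use a different variant. The function names `lz77_factorize`, `lz77_size`, and `substitute` are hypothetical and chosen for illustration.

```python
def lz77_factorize(text: str) -> list[tuple[int, int]]:
    """Greedy left-to-right factorization, returned as (start, length) pairs.

    Assumption: self-referential LZ77, where a factor may overlap its
    earlier occurrence. Runs in cubic time in the worst case.
    """
    factors = []
    n = len(text)
    i = 0
    while i < n:
        best = 0
        # Try every earlier starting position j < i as a source.
        for j in range(i):
            length = 0
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            best = max(best, length)
        # No earlier occurrence: emit a single fresh character.
        step = max(best, 1)
        factors.append((i, step))
        i += step
    return factors


def lz77_size(text: str) -> int:
    """Number of factors, i.e. the value a dynamic structure would maintain."""
    return len(lz77_factorize(text))


def substitute(text: str, pos: int, ch: str) -> str:
    """Apply a single-character substitution at index pos."""
    return text[:pos] + ch + text[pos + 1:]


if __name__ == "__main__":
    before = "abababab"
    after = substitute(before, 6, "c")
    print(lz77_size(before), lz77_size(after))
```

A single substitution can change several factors at once, which is why recomputing from scratch is the obvious baseline and why a sublinear update time is the interesting target.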