Practical Parallel Block Tree Construction: First Results

📅 2025-12-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Block trees are compressed text indexes with competitive query performance, but their construction is slow and memory-intensive, which limits practical deployment. This paper proposes fast, lightweight parallel algorithms for block tree construction, together with a data-parallel algorithm for Karp–Rabin fingerprint computation. On a single core, the approach is about as fast as the currently fastest construction algorithm; on 64 cores, it is up to four times faster. It achieves this while requiring an order of magnitude less working memory, with the stated goal of maintaining query performance.

📝 Abstract
The block tree [Belazzougui et al., J. Comput. Syst. Sci. '21] is a compressed representation of a length-$n$ text that supports access, rank, and select queries while requiring only $O(z \log \frac{n}{z})$ words of space, where $z$ is the number of Lempel-Ziv factors of the text. In other words, its space requirements are asymptotically similar to those of the compressed text. In practice, block trees offer query performance comparable to state-of-the-art compressed rank and select indices. However, their construction is significantly slower. Additionally, the fastest construction algorithms require a significant amount of working memory. To address this issue, we propose fast and lightweight parallel algorithms for the efficient construction of block trees. Our algorithm achieves speed similar to that of the currently fastest construction algorithm on one core and is up to four times faster using 64 cores. It achieves all that while requiring an order of magnitude less memory. As a result of independent interest, we present a data parallel algorithm for Karp-Rabin fingerprint computation.
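To make the quantity $z$ in the space bound concrete, the following is a minimal sketch that counts Lempel-Ziv factors with a naive quadratic scan. It assumes the LZ77 variant in which a factor is the longest prefix of the remaining text that also starts at an earlier position (overlaps allowed), or a single fresh character otherwise. The function names `lz_factor_count` and `space_bound_words` are hypothetical, and the bound is evaluated without the constants hidden by the $O$-notation. It is not the construction method of the paper.

```python
import math


def lz_factor_count(text: str) -> int:
    """Count LZ77-style factors with a naive O(n^2) scan (illustration only)."""
    n = len(text)
    i = 0  # start of the next factor
    z = 0  # number of factors emitted so far
    while i < n:
        longest = 0
        # Try every earlier start position j as a potential source.
        for j in range(i):
            length = 0
            # The source may overlap the factor itself; j + length < i + length < n.
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            longest = max(longest, length)
        # A character with no earlier occurrence forms a factor of length 1.
        i += max(longest, 1)
        z += 1
    return z


def space_bound_words(n: int, z: int) -> float:
    """Evaluate z * log2(n / z), ignoring the constant hidden in O(z log(n/z))."""
    return z * math.log2(n / z)
```

For a highly repetitive text, `lz_factor_count` stays small while `n` grows, which is why the bound can be far below `n` words.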
Problem

Research questions and friction points this paper is trying to address.

Block tree construction is slow and memory-intensive
Propose parallel algorithms for faster block tree building
Reduce memory usage while maintaining query performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel algorithms for block tree construction
Reduced memory usage by an order of magnitude
Data parallel Karp-Rabin fingerprint computation
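The page does not describe how the data-parallel fingerprint algorithm works, so the following is only a sketch of the algebraic property that makes chunk-wise Karp-Rabin computation possible: the fingerprint of a concatenation can be derived from the fingerprints and lengths of its parts. The modulus, base, chunk size, and all function names here are assumptions chosen for illustration, not values or APIs from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

PRIME = (1 << 61) - 1  # assumed modulus (a Mersenne prime), for illustration
BASE = 256             # assumed base, for illustration


def fingerprint(data: bytes) -> tuple[int, int]:
    """Return (hash of data, BASE ** len(data) mod PRIME)."""
    h = 0
    for byte in data:
        h = (h * BASE + byte) % PRIME
    return h, pow(BASE, len(data), PRIME)


def combine(left: tuple[int, int], right: tuple[int, int]) -> tuple[int, int]:
    """Fingerprint of the concatenation left + right, from the two parts."""
    left_hash, left_pow = left
    right_hash, right_pow = right
    # Shift the left hash by the length of the right part, then add the right hash.
    return (left_hash * right_pow + right_hash) % PRIME, (left_pow * right_pow) % PRIME


def chunked_fingerprint(data: bytes, chunk_size: int = 4, workers: int = 4) -> int:
    """Fingerprint data by hashing chunks independently and combining in order."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Threads only show the independence of the chunks; CPython's GIL means
    # this is not expected to run faster than the sequential loop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(fingerprint, chunks))
    acc = (0, 1)  # fingerprint of the empty string
    for part in parts:
        acc = combine(acc, part)
    return acc[0]
```

Because `combine` is associative, the per-chunk results could also be merged in a tree-shaped reduction rather than left to right; the sequential merge is used here only for brevity.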
Robert Clausecker
Zuse Institute Berlin, Germany
Florian Kurpicz
University of Münster
Text Indices, Compression, Scalable Data Structures, Succinct Data Structures, Algorithm Engineering
Etienne Palanga
TU Dortmund University, Germany