Practical Parallel Block Tree Construction: First Results

📅 2025-12-29

🤖 AI Summary

Block trees are compressed text indexes whose query performance is comparable to state-of-the-art compressed rank and select indices, but their construction is slow and the fastest construction algorithms need a large amount of working memory. This paper proposes fast, lightweight parallel algorithms for block tree construction, together with a data-parallel algorithm for Karp–Rabin fingerprint computation. On one core, the new algorithm is about as fast as the currently fastest construction algorithm; on 64 cores, it is up to four times faster. It achieves this while requiring an order of magnitude less memory.

📝 Abstract

The block tree [Belazzougui et al., J. Comput. Syst. Sci. '21] is a compressed representation of a length-$n$ text that supports access, rank, and select queries while requiring only $O(z\log\frac{n}{z})$ words of space, where $z$ is the number of Lempel-Ziv factors of the text. In other words, its space requirements are asymptotically similar to those of the compressed text. In practice, block trees offer comparable query performance to state-of-the-art compressed rank and select indices. However, their construction is significantly slower. Additionally, the fastest construction algorithms require a significant amount of working memory. To address this issue, we propose fast and lightweight parallel algorithms for the efficient construction of block trees. Our algorithm achieves similar speed to the currently fastest construction algorithm on one core and is up to four times faster using 64 cores. It achieves all that while requiring an order of magnitude less memory. As a result of independent interest, we present a data parallel algorithm for Karp-Rabin fingerprint computation.
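To make the three query types named in the abstract concrete, here is a naive reference sketch on an uncompressed string. It only illustrates what access, rank, and select return; it is not a block tree and says nothing about how the paper answers these queries. The function names and the indexing conventions (0-based positions, rank counting strictly before position `i`, select taking a 1-based occurrence number) are assumptions chosen for illustration, since conventions differ between libraries.

```python
def access(text: str, i: int) -> str:
    """Return the character at 0-based position i."""
    return text[i]


def rank(text: str, c: str, i: int) -> int:
    """Count occurrences of c in text[0:i] (position i itself excluded)."""
    return text[:i].count(c)


def select(text: str, c: str, k: int) -> int:
    """Return the 0-based position of the k-th occurrence of c (k >= 1),
    or -1 if c occurs fewer than k times."""
    pos = -1
    for _ in range(k):
        pos = text.find(c, pos + 1)
        if pos == -1:
            return -1
    return pos


if __name__ == "__main__":
    text = "abracadabra"
    print(access(text, 4))       # character at position 4
    print(rank(text, "a", 4))    # number of 'a' in "abra"
    print(select(text, "a", 3))  # position of the third 'a'
```

Each of these calls scans the text, so it takes time linear in the text length. A compressed index such as the block tree exists to answer the same queries without storing or scanning the full text.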

Problem

Research questions and friction points this paper is trying to address.

Block tree construction is slow and memory-intensive

Propose parallel algorithms for faster block tree building

Reduce memory usage while maintaining query performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel algorithms for block tree construction: similar speed to the fastest existing algorithm on one core, up to four times faster on 64 cores

Memory usage during construction reduced by an order of magnitude

Data parallel Karp-Rabin fingerprint computation
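The summary does not describe how the paper's data-parallel fingerprint algorithm works, so the following is only a minimal sketch of the standard property that makes Karp-Rabin fingerprints amenable to parallel computation: the fingerprint of a concatenation can be derived from the fingerprints of its parts. The modulus, the base, the function names, and the chunking strategy are all assumptions made for illustration, not the paper's method.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative parameters (assumptions): a Mersenne prime modulus and a fixed base.
# Practical implementations usually pick the base at random.
PRIME = (1 << 61) - 1
BASE = 256


def fingerprint(data: bytes) -> int:
    """Sequential Karp-Rabin fingerprint: polynomial in BASE evaluated mod PRIME."""
    h = 0
    for byte in data:
        h = (h * BASE + byte) % PRIME
    return h


def combine(left_fp: int, right_fp: int, right_len: int) -> int:
    """Fingerprint of left + right, given both fingerprints and len(right)."""
    return (left_fp * pow(BASE, right_len, PRIME) + right_fp) % PRIME


def chunked_fingerprint(data: bytes, num_chunks: int = 4) -> int:
    """Fingerprint each chunk independently, then merge the results left to right."""
    if not data:
        return 0
    size = -(-len(data) // num_chunks)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor() as pool:
        partial = list(pool.map(fingerprint, chunks))
    total = 0
    for chunk, fp in zip(chunks, partial):
        total = combine(total, fp, len(chunk))
    return total


if __name__ == "__main__":
    data = b"abracadabra" * 100
    print(chunked_fingerprint(data, 7) == fingerprint(data))
```

The thread pool here only demonstrates that the per-chunk work is independent. CPython's global interpreter lock prevents pure-Python loops from running on several cores at once, so an implementation aiming for real speedup would use native threads or vector instructions instead.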
