BS-tree: A gapped data-parallel B-tree

📅 2025-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the trade-off between SIMD acceleration and memory efficiency in in-memory B+-trees, proposing BS-tree, a compact in-memory index that supports branchless, SIMD-parallel search and updates. Its core innovations are: (1) gaps within nodes, created by duplicating key values, which enable branchless SIMD comparisons in every node along the traversal path and updates without key shifting; and (2) Frame-of-Reference (FOR) compression over memory-block-sized nodes, which lets node capacity vary and reduces storage overhead while sustaining high throughput. According to the summary, under single- and multi-threaded mixed workloads BS-tree achieves up to 3.2× lower query latency than state-of-the-art in-memory indexes (e.g., ART, FAST) and reduces memory footprint by 47%. It also outperforms existing learned indexes, offering a better balance of speed, space efficiency, and scalability.

📝 Abstract

We propose BS-tree, an in-memory implementation of the B+-tree that adopts the structure of the disk-based index (i.e., a balanced, multiway tree), setting the node size to a memory block that can be processed fast and in parallel using SIMD instructions. A novel feature of the BS-tree is that it enables gaps (unused positions) within nodes by duplicating key values. This allows (i) branchless SIMD search within each node, and (ii) branchless update operations in nodes without key shifting. We implement a frame of reference (FOR) compression mechanism, which allows nodes to have varying capacities, and can greatly decrease the memory footprint of BS-tree. We compare our approach to existing main-memory indices and learned indices under different workloads of queries and updates and demonstrate its robustness and superiority compared to previous work in single- and multi-threaded processing.
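The gap mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `MAX_KEY`, `rank`, `contains`, and `insert_into_gap` are hypothetical, and the layout (a gap holds a copy of the next used key, trailing free slots hold a maximum sentinel) is an assumption made for illustration. The search counts the slots holding a smaller key instead of exiting early, which mirrors what a SIMD compare followed by a population count would compute; the insert path uses ordinary conditionals for readability.

```python
MAX_KEY = 2**64 - 1  # assumed sentinel for unused slots at the end of a node


def rank(keys, q):
    """Count slots whose key is < q. Every slot is compared, with no early
    exit, which is the scalar analogue of a SIMD compare plus popcount."""
    return sum(k < q for k in keys)


def contains(keys, q):
    """Membership test: the slot at position rank(q) holds q if q is present.
    Duplicated (gap) slots do not disturb this, since they hold the same value."""
    pos = rank(keys, q)
    return pos < len(keys) and keys[pos] == q


def insert_into_gap(keys, k):
    """Insert k by overwriting a gap, without shifting any key.
    Returns False when the target slot is not a gap (shifting or a
    split would be needed, which this sketch does not cover)."""
    pos = rank(keys, k)
    if pos >= len(keys):
        return False  # node has no slot at or after k's position
    if keys[pos] == k:
        return True  # key already present
    # Assumed gap rule: a slot is free if it holds the sentinel or
    # duplicates the key stored in the next slot.
    is_gap = keys[pos] == MAX_KEY or (
        pos + 1 < len(keys) and keys[pos] == keys[pos + 1]
    )
    if not is_gap:
        return False
    keys[pos] = k  # single write, sorted order is preserved
    return True


node = [10, 20, 20, 30, MAX_KEY, MAX_KEY]  # slot 1 is a gap duplicating 20
print(contains(node, 20), contains(node, 25))
print(insert_into_gap(node, 15), node[:4])
```

In this sketch, inserting 15 overwrites the duplicated 20 in slot 1, so the node stays sorted after a single write. Inserting 25 into the resulting node would return `False`, because slot 3 holds a real key rather than a gap.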

Problem

Research questions and friction points this paper is trying to address.

Efficient in-memory B+-tree with SIMD parallelism

Branchless node operations via key duplication

Compressed variable-capacity nodes for memory savings

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gapped B+-tree with SIMD-optimized parallel processing

Branchless search and updates via key duplication

Frame of Reference compression for variable node capacities
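To show why frame-of-reference compression leads to nodes with varying capacities, here is a small Python sketch. It is an illustration under stated assumptions, not the paper's encoding: the function names, the 256-byte node size, the 8-byte header, and the restriction to 8/16/32/64-bit offset widths are all hypothetical choices. The idea is that a node stores one base key plus small offsets, so a node whose keys lie close together needs fewer bits per key and can hold more keys in the same memory block.

```python
def for_encode(keys):
    """Frame of reference: keep the minimum key once and store each key
    as a non-negative offset from it. Returns (base, bits_needed, offsets)."""
    base = min(keys)
    offsets = [k - base for k in keys]
    bits_needed = max(1, max(offsets).bit_length())
    return base, bits_needed, offsets


def for_decode(base, offsets):
    """Recover the original keys from the base and the offsets."""
    return [base + d for d in offsets]


def capacity(node_bytes, bits_needed, header_bytes=8):
    """Number of key slots that fit in a fixed-size node (sizes assumed).
    Offsets are rounded up to a machine-friendly width; keys are assumed
    to fit in 64 bits."""
    stored_bits = next(w for w in (8, 16, 32, 64) if w >= bits_needed)
    return (node_bytes - header_bytes) * 8 // stored_bits


keys = [1_000_000, 1_000_010, 1_000_200]
base, bits, offsets = for_encode(keys)
print(base, bits, offsets)
print(capacity(256, bits), capacity(256, 64))
```

With these assumed sizes, the three clustered keys need only 8-bit offsets, so a 256-byte node could hold 248 slots, compared with 31 slots for uncompressed 64-bit keys.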
