Bridging Cache-Friendliness and Concurrency: A Locality-Optimized In-Memory B-Skiplist

📅 2025-07-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional skip lists suffer from poor cache locality because each node holds a single element, limiting in-memory indexing performance. This paper proposes the B-skiplist, an in-memory index structure that combines the multi-element node organization of B-trees with the lightweight concurrency control of skip lists. Its core contributions are a top-down, single-pass insertion algorithm, compact multi-element nodes, and a simple top-down concurrency control scheme, preserving skip-list simplicity while substantially improving cache efficiency. On a 128-thread system, the B-skiplist achieves 2–9× higher throughput than Folly's concurrent skip list and Java's ConcurrentSkipListMap, delivers 3.5–103× lower 99th-percentile latency, and reaches point-query throughput competitive (0.9–1.7×) with state-of-the-art cache-optimized tree-based indexes such as Masstree. The B-skiplist thus addresses the long-standing cache-unfriendliness of highly concurrent skip lists.
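The cache-locality argument above comes down to node layout: a classic skip list pays roughly one cache miss per element visited, while a multi-element ("fat") node amortizes that miss across many contiguous keys. A minimal single-level sketch of the idea, where `Node`, `B`, and the helper functions are illustrative assumptions rather than the paper's actual code:

```python
import bisect

B = 8  # max keys per node; kept small here for illustration

class Node:
    """A 'fat' node: up to B sorted keys in one contiguous array."""
    def __init__(self, keys=None):
        self.keys = keys or []   # sorted array of at most B keys
        self.next = None         # next node at the same level

def build_level(sorted_keys):
    """Pack a sorted key sequence into a linked list of fat nodes."""
    head = cur = Node(sorted_keys[:B])
    for i in range(B, len(sorted_keys), B):
        cur.next = Node(sorted_keys[i:i + B])
        cur = cur.next
    return head

def contains(head, key):
    """Skip whole nodes by comparing against each node's max key,
    then binary-search inside the single candidate node."""
    node = head
    while node is not None:
        if node.keys and key <= node.keys[-1]:
            i = bisect.bisect_left(node.keys, key)
            return i < len(node.keys) and node.keys[i] == key
        node = node.next
    return False
```

A lookup touches one node (one or two cache lines) per level instead of one node per element inspected, which is the locality gain the summary describes; the paper's structure applies this layout across all skip-list levels.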

📝 Abstract
Skiplists are widely used for in-memory indexing in many key-value stores, such as RocksDB and LevelDB, due to their ease of implementation and simple concurrency control mechanisms. However, traditional skiplists suffer from poor cache locality, as they store only a single element per node, leaving performance on the table. Minimizing last-level cache misses is key to maximizing in-memory index performance, making high cache locality essential. In this paper, we present a practical concurrent B-skiplist that enhances cache locality and performance while preserving the simplicity of traditional skiplist structures and concurrency control schemes. Our key contributions include a top-down, single-pass insertion algorithm for B-skiplists and a corresponding simple and efficient top-down concurrency control scheme. On 128 threads, the proposed concurrent B-skiplist achieves between 2x-9x higher throughput compared to state-of-the-art concurrent skiplist implementations, including Facebook's concurrent skiplist from Folly and the Java ConcurrentSkipListMap. Furthermore, we find that the B-skiplist achieves competitive (0.9x-1.7x) throughput on point workloads compared to state-of-the-art cache-optimized tree-based indices (e.g., Masstree). For a more complete picture of the performance, we also measure the latency of skiplist and tree-based indices and find that the B-skiplist achieves between 3.5x-103x lower 99% latency compared to other concurrent skiplists and between 0.85x-64x lower 99% latency compared to tree-based indices on point workloads with inserts.
Problem

Research questions and friction points this paper is trying to address.

Improving cache locality in concurrent skiplists
Enhancing performance while maintaining simplicity
Reducing latency in in-memory indexing structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Locality-optimized concurrent B-skiplist design
Top-down single-pass insertion algorithm
Efficient top-down concurrency control scheme
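The top-down, single-pass idea can be sketched at a single level: split any full node as soon as the pass reaches it, so an insert never has to back up to repair an overflow afterwards. This mirrors preemptive splitting in B-trees; the paper's actual multi-level algorithm and its concurrency control scheme are more involved, and all names and constants below are assumptions for exposition:

```python
import bisect

B = 4  # node capacity (assumed small for illustration)

class Node:
    def __init__(self, keys=None):
        self.keys = keys or []   # sorted array of at most B keys
        self.next = None

def insert(head, key):
    """Single forward pass over one level: split full nodes on arrival
    (so the target always has room), otherwise advance or insert.
    The pass never revisits a node it has already left behind."""
    node = head
    while True:
        if len(node.keys) == B:                 # preemptive split
            mid = B // 2
            right = Node(node.keys[mid:])
            right.next = node.next
            node.keys = node.keys[:mid]
            node.next = right
        nxt = node.next
        if nxt is not None and nxt.keys and key >= nxt.keys[0]:
            node = nxt                          # keep moving right
            continue
        bisect.insort(node.keys, key)           # room is guaranteed
        return

def collect(head):
    """Flatten the level back into one key list (for checking)."""
    out, node = [], head
    while node is not None:
        out.extend(node.keys)
        node = node.next
    return out
```

Because overflow is resolved before descending past a node, no state from earlier in the pass needs fixing up later; a plausible reading of the paper's contribution is that this is what keeps the concurrency control simple, since each step only touches the node currently in hand.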
Yicong Luo
Georgia Institute of Technology, Atlanta, USA
Senhe Hao
Georgia Institute of Technology, Atlanta, USA
Brian Wheatman
University of Chicago, Chicago, USA
Prashant Pandey
Northeastern University, Boston, USA
Helen Xu
Georgia Institute of Technology, Atlanta, USA
Parallel Algorithms · Caching · Performance Engineering