FOCUS: Boosting Schema-aware Access for KV Stores via Hierarchical Data Management

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing key-value (KV) stores are semantically mismatched with structured applications such as NewSQL systems: hierarchical data must be flattened into flat key-value pairs, which causes severe I/O amplification and I/O splitting. To bridge this gap, we propose the first log-structured KV store supporting fine-grained hierarchical data organization and schema-aware access. Our approach introduces (1) a hierarchical KV data model that natively represents nested structures and preserves schema semantics, and (2) an NVM-optimized log-structured engine that integrates hierarchical key-space management with schema-aware write-path optimization and query routing, removing the need for flattening. Under YCSB SQL workloads, our system achieves 2.1–5.9× higher throughput than state-of-the-art NVM-based KV stores, improving the efficiency of both structured data ingestion and retrieval.

📝 Abstract
Persistent key-value (KV) stores are critical infrastructure for data-intensive applications, and using high-performance Non-Volatile Memory (NVM) to enhance them has gained traction. However, previous work has primarily focused on optimizing KV stores themselves, without adequately addressing how they are integrated into applications. Consequently, existing applications, represented by NewSQL databases, still resort to a flat mapping approach that simply maps structured records into flat KV pairs in order to use KV stores. This semantic mismatch can cause significant I/O amplification and I/O splitting under production workloads, degrading performance. To this end, we propose FOCUS, a log-structured KV store optimized for fine-grained hierarchical data organization and schema-aware access. FOCUS introduces a hierarchical KV model to provide native support for upper-layer structured data. We implemented FOCUS from scratch. Experiments show that FOCUS increases throughput by 2.1–5.9× compared to mainstream NVM-backed KV stores under YCSB SQL workloads.
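
To make the flat mapping described above concrete, the following Python sketch encodes one structured record in two flat layouts. Everything here is an assumption made for illustration: the key format, the function names, and the reading of "I/O amplification" as rewriting a whole value to change one field and "I/O splitting" as issuing one lookup per field. None of it is taken from the FOCUS implementation.

```python
# Hypothetical illustration of flat mapping; the encodings are assumptions,
# not the format used by FOCUS or by any particular NewSQL database.

def encode_row_as_one_pair(table, pk, row):
    """Row-granularity flat mapping: the whole record becomes one value."""
    key = f"{table}/{pk}"
    value = "|".join(f"{col}={row[col]}" for col in sorted(row))
    return {key: value}


def encode_row_per_column(table, pk, row):
    """Column-granularity flat mapping: one KV pair per field."""
    return {f"{table}/{pk}/{col}": str(val) for col, val in row.items()}


row = {"name": "alice", "email": "a@example.com", "balance": 100}

whole = encode_row_as_one_pair("users", 1, row)
per_col = encode_row_per_column("users", 1, row)

# Changing only `balance` in the row-granularity layout rewrites the full value.
bytes_changed = len(str(row["balance"]))
bytes_rewritten = len(whole["users/1"])

# Reading the full record in the column-granularity layout takes one lookup per field.
lookups_for_full_row = len(per_col)

print(f"rewrote {bytes_rewritten} bytes to change {bytes_changed}")
print(f"{lookups_for_full_row} lookups to read one record")
```

Under these assumptions, each flat layout trades one cost for the other, which is the tension a schema-aware store is meant to avoid.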
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic mismatch in KV stores for structured data
Reduces I/O amplification and splitting in production workloads
Enables schema-aware access via hierarchical data organization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical KV model for structured data
Log-structured KV store optimization
Schema-aware access for performance boost
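
As a rough intuition for what a hierarchical KV model could look like to a caller, here is a toy in-memory Python sketch that addresses data by table, primary key, and field. It is not log-structured, does not target NVM, and the class and method names are hypothetical; the page does not describe FOCUS's actual interface or layout.

```python
# Toy sketch only: a nested in-memory mapping standing in for a hierarchical
# key space. All names are hypothetical and not taken from FOCUS.

class ToyHierarchicalStore:
    def __init__(self):
        self._tables = {}          # table -> primary key -> field -> value
        self.last_write_bytes = 0  # size of the most recent field write

    def put_field(self, table, pk, field, value):
        """Write a single field without touching its sibling fields."""
        record = self._tables.setdefault(table, {}).setdefault(pk, {})
        record[field] = value
        self.last_write_bytes = len(str(value))

    def get_field(self, table, pk, field):
        """Read one field by its full hierarchical path."""
        return self._tables[table][pk][field]

    def get_record(self, table, pk):
        """Read a whole record with a single record-level request."""
        return dict(self._tables[table][pk])


store = ToyHierarchicalStore()
store.put_field("users", 1, "name", "alice")
store.put_field("users", 1, "email", "a@example.com")
store.put_field("users", 1, "balance", 100)

# A field-level update writes only the new field value.
store.put_field("users", 1, "balance", 250)
print(store.last_write_bytes)
print(store.get_record("users", 1))
```

The point of the sketch is the access granularity: the caller can name either a single field or a whole record, so the store, rather than the application, decides how the pieces are laid out.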
Zhen Liu
University of Science and Technology of China
Wenzhe Zhu
University of Science and Technology of China
Yongkun Li
University of Science and Technology of China
Storage System, Memory and File System, Key-value System, Graph System
Yinlong Xu
University of Science and Technology of China