ABase: the Multi-Tenant NoSQL Serverless Database for Diverse and Dynamic Workloads in Large-scale Cloud Environments

📅 2025-05-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses three challenges that multi-tenant, serverless NoSQL databases face in large-scale cloud environments under dynamic, heterogeneous workloads: (1) caching breaks performance isolation, (2) the system responds slowly to traffic surges, and (3) resources are allocated unevenly across data nodes. To tackle these, the authors propose: (1) a two-tier caching architecture with cache-aware, fine-grained performance isolation; (2) an auto-scaling policy driven by time-series forecasting for rapid elasticity; and (3) a multidimensional rescheduling algorithm that jointly optimizes CPU, memory, and I/O resources. The system is deployed in ByteDance’s production infrastructure (the world’s largest serverless NoSQL deployment), where it supports a peak throughput exceeding 13 billion QPS and total storage of over 1 exabyte. Evaluation results demonstrate strong performance isolation, a 57% reduction in service jitter, and an improvement of over 35% in aggregate resource utilization.

📝 Abstract
Multi-tenant architectures enhance the elasticity and resource utilization of NoSQL databases by allowing multiple tenants to co-locate and share resources. However, in large-scale cloud environments, the diverse and dynamic nature of workloads poses significant challenges for multi-tenant NoSQL databases. Based on our practical observations, we have identified three crucial challenges: (1) the impact of caching on performance isolation, as cache hits alter request execution and resource consumption, leading to inaccurate traffic control; (2) the dynamic changes in traffic, with changes in tenant traffic trends causing throttling or resource wastage, and changes in access distribution causing hot key pressure or cache hit ratio drops; and (3) the imbalanced layout of data nodes due to tenants' diverse resource requirements, leading to low resource utilization. To address these challenges, we introduce ABase, a multi-tenant NoSQL serverless database developed at ByteDance. ABase introduces a two-layer caching mechanism with a cache-aware isolation mechanism to ensure accurate resource consumption estimates. Furthermore, ABase employs a predictive autoscaling policy to dynamically adjust resources in response to tenant traffic changes and a multi-resource rescheduling algorithm to balance resource utilization across data nodes. With these innovations, ABase has successfully served ByteDance's large-scale cloud environment, supporting a total workload that has achieved a peak QPS of over 13 billion and total storage exceeding 1 EB.
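
The first challenge, that cache hits change how much a request actually costs, can be illustrated with a small sketch. This is not ABase's implementation: the `TenantQuota` class, the `admitted` helper, and the `HIT_COST` and `MISS_COST` values are hypothetical and chosen only to show why charging requests by their execution path gives a different admission result than charging every request the same amount.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical per-request costs in abstract "request units" (not from the paper).
HIT_COST = 1.0   # request served from cache
MISS_COST = 4.0  # request that has to reach the storage layer


@dataclass
class TenantQuota:
    """Per-tenant budget of request units for one accounting interval."""
    capacity: float
    used: float = 0.0

    def try_admit(self, cache_hit: bool) -> bool:
        """Charge the request by its execution path; reject it if over quota."""
        cost = HIT_COST if cache_hit else MISS_COST
        if self.used + cost > self.capacity:
            return False
        self.used += cost
        return True


def admitted(quota: TenantQuota, hits: Iterable[bool]) -> int:
    """Count how many requests of a stream the quota admits."""
    return sum(quota.try_admit(hit) for hit in hits)
```

Under these assumed costs, a tenant whose requests mostly hit the cache is admitted far more often than one whose requests all miss, even though both send the same number of requests. A flat per-request charge would not capture that difference.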
Problem

Research questions and friction points this paper is trying to address.

Impact of caching on performance isolation in multi-tenant NoSQL databases
Dynamic traffic changes causing throttling or resource wastage
Imbalanced data node layout due to diverse resource requirements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-layer caching with cache-aware isolation
Predictive autoscaling for dynamic traffic
Multi-resource rescheduling for balanced utilization
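
To make the idea of balancing several resources at once more concrete, here is a minimal greedy sketch. It is an assumption-laden illustration, not the rescheduling algorithm from the paper: the `Load` tuple, the `peak` metric (the most-utilized resource dimension on a node), and the single-move `rebalance_once` step are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical replica load: (cpu, memory, io), each a fraction of node capacity.
Load = Tuple[float, float, float]


def peak(replicas: List[Load]) -> float:
    """Utilization of the most-loaded resource dimension on a node."""
    if not replicas:
        return 0.0
    return max(sum(r[d] for r in replicas) for d in range(3))


def rebalance_once(nodes: Dict[str, List[Load]]) -> bool:
    """Move one replica off the hottest node if that lowers the higher of the
    source and destination peaks. Returns True if a replica was moved."""
    src = max(nodes, key=lambda n: peak(nodes[n]))
    before = peak(nodes[src])
    best = None  # (resulting peak, replica index, destination node)
    for i, rep in enumerate(nodes[src]):
        src_after = peak(nodes[src][:i] + nodes[src][i + 1:])
        for dst in nodes:
            if dst == src:
                continue
            after = max(src_after, peak(nodes[dst] + [rep]))
            if after < before and (best is None or after < best[0]):
                best = (after, i, dst)
    if best is None:
        return False
    _, i, dst = best
    nodes[dst].append(nodes[src].pop(i))
    return True
```

Looking at the maximum across dimensions, rather than a single resource, is what keeps a move from relieving CPU pressure on one node only to create memory pressure on another.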
Authors

Rong Kang
ByteDance Inc.
Yanbin Chen
ByteDance Inc.
Ye Liu
ByteDance Inc.
Fuxin Jiang
ByteDance
Time Series Forecasting, Resource Scheduling, LLM
Qingshuo Li
ByteDance Inc.
Miao Ma
ByteDance Inc.
Jian Liu
ByteDance Inc.
Guangliang Zhao
ByteDance Inc.
Tieying Zhang
Research Scientist at ByteDance
AI for Systems, Systems for AI
Jianjun Chen
ByteDance Inc.
Lei Zhang
ByteDance Inc.