Swift: Rethinking RDMA Control Plane for Elastic Computing

📅 2025-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address slow connection establishment and poor resource sharing in RDMA control planes under dynamic scaling in elastic computing, this paper proposes Swift, a co-design of RDMA with a serverless framework built on two ideas: (1) a cache-optimized libibverbs that accelerates RDMA control-plane initialization for both cold and warm starts; and (2) use of RDMA's fork capability to share user-space RDMA resources efficiently across processes. The paper shows that user-space RDMA connection setup can be sped up with caching and that RDMA resources can be shared among processes via fork, challenging the conventional belief that extreme microsecond-level control-plane optimization is necessary. Swift is implemented with the serverless framework OpenWhisk. Experiments show that, compared to prior solutions, it achieves 30.56–46.50% higher average throughput and 18.55–37.21% lower latency, at a cost of 6.5% control plane overhead.

📝 Abstract
Elastic computing enables dynamic scaling to meet workload demands, and Remote Direct Memory Access (RDMA) enhances this by providing high-throughput, low-latency network communication. However, integrating RDMA into elastic computing remains a challenge, particularly in control plane operations for RDMA connection setup. This paper revisits the assumptions of prior work on high-performance RDMA for elastic computing, and reveals that extreme microsecond-level control plane optimizations are often unnecessary. By challenging the conventional beliefs on the slowness of user-space RDMA control plane and the difficulty of user-space RDMA resource sharing, we uncover new design opportunities. Our key insight is that user-space RDMA connection setup can be significantly improved with caching, while RDMA resources can be efficiently shared among processes using fork. In light of this, we propose Swift, a simple yet effective solution that co-designs RDMA with a serverless framework to optimize performance for elastic computing. At its very core, Swift handles cold and warm serverless requests by swiftly initializing the RDMA control plane with cache-optimized libibverbs, and manages fork requests by leveraging the RDMA's fork capability. Implemented with OpenWhisk, Swift delivers 30.56-46.50% higher average throughput and 18.55-37.21% lower latency, at a cost of 6.5% control plane overhead, compared to prior solutions.
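The two ideas in the abstract (caching an expensive setup step, and letting a forked process reuse what its parent already initialized) can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use libibverbs or any RDMA hardware, `open_context`, `timed_open`, `run_in_fork`, and `SETUP_COST_S` are hypothetical names invented for illustration, and the sleep stands in for real control-plane work. It requires a POSIX system because it calls `os.fork`.

```python
import os
import time
from functools import lru_cache

SETUP_COST_S = 0.05  # hypothetical cost of one control-plane initialization


@lru_cache(maxsize=None)
def open_context(device: str) -> dict:
    """Stand-in for an expensive device/context initialization (not a real RDMA call)."""
    time.sleep(SETUP_COST_S)
    return {"device": device, "owner_pid": os.getpid()}


def timed_open(device: str):
    """Return the context and how long it took to obtain it."""
    start = time.perf_counter()
    ctx = open_context(device)
    return ctx, time.perf_counter() - start


def run_in_fork(device: str):
    """Fork a child, have it obtain the context, and report (creator pid, elapsed seconds)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the cache populated by the parent is inherited across fork.
        os.close(read_fd)
        ctx, elapsed = timed_open(device)
        os.write(write_fd, f"{ctx['owner_pid']} {elapsed}".encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: read the child's report and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    owner, elapsed = data.split()
    return int(owner), float(elapsed)


if __name__ == "__main__":
    _, cold = timed_open("dev0")      # first call pays the setup cost
    _, warm = timed_open("dev0")      # second call is served from the cache
    owner, forked = run_in_fork("dev0")
    print(f"cold={cold:.4f}s warm={warm:.6f}s forked={forked:.6f}s")
    print("context created by parent:", owner == os.getpid())
```

In this model the first call pays the setup cost, the repeated call is a cache hit, and the forked child sees a context that was created by the parent without paying the cost again. How Swift actually caches inside libibverbs and how it handles fork for real RDMA resources is described in the paper, not here.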
Problem

Research questions and friction points this paper is trying to address.

RDMA
elastic computing
control operation efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

RDMA optimization
fork function
elastic computing efficiency
Authors

Junxue Zhang
University of Science and Technology of China
Data Center Networking, ML System, RDMA

Han Tian
University of Science and Technology of China
Machine learning, networking, privacy computing

Xinyang Huang
iSINGLab @ Hong Kong University of Science and Technology

Wenxue Li
Harbin Institute of Technology Weihai

Kaiqiang Xu
PhD, AI and ML Systems
ML Systems, Cloud Computing, Computer Networks

Dian Shen
Southeast University

Yong Wang
iSINGLab @ Hong Kong University of Science and Technology

Kai Chen
iSINGLab @ Hong Kong University of Science and Technology