Lotus: Optimizing Disaggregated Transactions with Disaggregated Locks

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
In disaggregated memory (DM) architectures, transaction systems are bottlenecked by the RDMA NICs on memory nodes, because locks are implemented with frequent one-sided atomic operations. Lotus addresses this by disaggregating locks from data and executing all locks on compute nodes, which removes the atomic-operation pressure from memory-node NICs. Its main contributions are: (1) a lock-disaggregated transaction architecture for DM; (2) an application-aware lock management mechanism that shards locks by OLTP workload locality while maintaining load balance; (3) a lock-first transaction protocol that acquires locks as the first step of each read-write transaction, so conflicts are detected and aborted early; and (4) a lightweight, lock-rebuild-free recovery mechanism that treats locks as ephemeral. In the reported experiments, Lotus improves throughput by up to 2.1× and reduces latency by up to 49.4% compared with state-of-the-art transaction systems on DM.

📝 Abstract
Disaggregated memory (DM) separates compute and memory resources, allowing flexible scaling to achieve high resource utilization. To ensure atomic and consistent data access on DM, distributed transaction systems have been adapted, where compute nodes (CNs) rely on one-sided RDMA operations to access remote data in memory nodes (MNs). However, we observe that in existing transaction systems, the RDMA network interface cards (RNICs) at MNs become a primary performance bottleneck. This bottleneck arises from the high volume of one-sided atomic operations used for locks, which hinders the system's ability to scale efficiently. To address this issue, this paper presents Lotus, a scalable distributed transaction system with lock disaggregation on DM. The key innovation of Lotus is to disaggregate locks from data and execute all locks on CNs, thus eliminating the bottleneck at MN RNICs. To achieve efficient lock management on CNs, Lotus employs an application-aware lock management mechanism that leverages the locality of the OLTP workloads to shard locks while maintaining load balance. To ensure consistent transaction processing with lock disaggregation, Lotus introduces a lock-first transaction protocol, which separates the locking phase as the first step in each read-write transaction execution. This protocol allows the system to determine the success of lock acquisitions early and proactively abort conflicting transactions, improving overall efficiency. To tolerate lock loss during CN failures, Lotus employs a lock-rebuild-free recovery mechanism that treats locks as ephemeral and avoids their reconstruction, ensuring lightweight recovery for CN failures. Experimental results demonstrate that Lotus improves transaction throughput by up to 2.1$\times$ and reduces latency by up to 49.4% compared to state-of-the-art transaction systems on DM.
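The lock-first idea in the abstract can be illustrated with a minimal single-process sketch. This is not the Lotus implementation: `LockTable`, `run_lock_first`, and the dict `store` are hypothetical names, the lock table stands in for locks held on a compute node, and `store` stands in for data that would really be reached through one-sided RDMA on memory nodes. It only shows the ordering: every lock is acquired before any data is touched, and a conflict aborts the transaction early.

```python
from dataclasses import dataclass, field


@dataclass
class LockTable:
    """Hypothetical in-memory lock table held by one compute node (CN)."""
    owners: dict = field(default_factory=dict)  # key -> transaction id

    def try_lock(self, key, txn_id):
        # Acquire if free, or succeed if this transaction already holds it.
        holder = self.owners.setdefault(key, txn_id)
        return holder == txn_id

    def unlock(self, key, txn_id):
        # Release only locks that this transaction actually owns.
        if self.owners.get(key) == txn_id:
            del self.owners[key]


def run_lock_first(txn_id, write_set, locks, store):
    """Lock every key in the write set first; abort early on any conflict."""
    acquired = []
    for key in sorted(write_set):
        if not locks.try_lock(key, txn_id):
            # Conflict found before any data access: release and abort.
            for k in acquired:
                locks.unlock(k, txn_id)
            return False
        acquired.append(key)
    # Only after all locks succeed does the transaction touch the data
    # (the dict `store` is a stand-in for remote memory-node accesses).
    store.update(write_set)
    for k in acquired:
        locks.unlock(k, txn_id)
    return True
```

In this sketch a transaction that loses a lock race returns `False` without having read or written any data, which is the early-abort property the abstract attributes to the lock-first protocol.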
Problem

Research questions and friction points this paper is trying to address.

Optimizes distributed transactions on disaggregated memory (DM) systems
Removes the RNIC bottleneck at memory nodes, caused by one-sided atomic operations for locks, by moving locks to compute nodes
Improves scalability and efficiency through a lock-first protocol and lightweight, lock-rebuild-free recovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

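Disaggregates locks from data and executes all locks on compute nodes (CNs)
Uses an application-aware lock management mechanism that shards locks by workload locality while maintaining load balance
Introduces a lock-first transaction protocol that keeps transaction processing consistent and aborts conflicting transactions early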
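One way to picture locality-based lock sharding with load balance is the sketch below. It rests on assumptions not found in the source: lock keys are taken to be `(partition, record)` pairs, per-partition load is assumed to be known, and the greedy placement is a generic balancing heuristic, not the mechanism Lotus uses. The function names `assign_partitions` and `lock_owner` are hypothetical.

```python
def assign_partitions(partition_load, num_cns):
    """Greedy placement: heaviest partition first, onto the least-loaded CN.

    Assumption: `partition_load` maps a partition id to an observed request
    count. All locks of one partition land on the same CN (locality).
    """
    cn_load = [0] * num_cns
    owner = {}
    ordered = sorted(partition_load.items(), key=lambda kv: (-kv[1], kv[0]))
    for part, load in ordered:
        # Pick the CN with the smallest load so far; ties go to the lower id.
        cn = min(range(num_cns), key=lambda i: (cn_load[i], i))
        owner[part] = cn
        cn_load[cn] += load
    return owner, cn_load


def lock_owner(key, owner):
    """Route a lock request for (partition, record) to the owning CN."""
    partition, _record = key
    return owner[partition]
```

Keeping a whole partition on one CN means a transaction that touches only that partition acquires all of its locks from a single node, while the greedy step keeps the per-CN load roughly even.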
🔎 Similar Papers
Zhisheng Hu
The Chinese University of Hong Kong
Pengfei Zuo
Huawei
AI Infrastructure, Cloud Infrastructure, Machine Learning Systems, Memory Systems, Storage Systems
Junliang Hu
The Chinese University of Hong Kong
Yizou Chen
The Chinese University of Hong Kong
Yingjia Wang
The Chinese University of Hong Kong
Ming-Chang Yang
Associate Professor, Department of Computer Science & Engineering at Chinese University of Hong
Non-Volatile Memory, Memory/Storage Systems, Embedded Systems, Computer Systems