🤖 AI Summary
In disaggregated memory (DM) architectures, transaction systems suffer from a bottleneck at the RDMA NICs of memory nodes (MNs), caused by the high volume of one-sided atomic operations used for locks. To address this, the paper presents Lotus, which disaggregates locks from data and executes all locks on compute nodes (CNs), removing atomic-operation pressure from MN RNICs. Its key components are: (1) lock disaggregation on DM; (2) an application-aware lock management mechanism that shards locks according to OLTP workload locality while maintaining load balance; (3) a lock-first transaction protocol that acquires locks as the first step of each read-write transaction and aborts conflicting transactions early; and (4) a lightweight, lock-rebuild-free recovery mechanism that treats locks as ephemeral. Evaluation shows that Lotus improves throughput by up to 2.1× and reduces latency by up to 49.4% compared to state-of-the-art transaction systems on DM.
📝 Abstract
Disaggregated memory (DM) separates compute and memory resources, allowing flexible scaling to achieve high resource utilization. To ensure atomic and consistent data access on DM, distributed transaction systems have been adapted, where compute nodes (CNs) rely on one-sided RDMA operations to access remote data in memory nodes (MNs). However, we observe that in existing transaction systems, the RDMA network interface cards at MNs become a primary performance bottleneck. This bottleneck arises from the high volume of one-sided atomic operations used for locks, which hinders the system's ability to scale efficiently.
To address this issue, this paper presents Lotus, a scalable distributed transaction system with lock disaggregation on DM. The key innovation of Lotus is to disaggregate locks from data and execute all locks on CNs, thus eliminating the bottleneck at MN RNICs. To achieve efficient lock management on CNs, Lotus employs an application-aware lock management mechanism that leverages the locality of the OLTP workloads to shard locks while maintaining load balance. To ensure consistent transaction processing with lock disaggregation, Lotus introduces a lock-first transaction protocol, which separates the locking phase as the first step in each read-write transaction execution. This protocol allows the system to determine the success of lock acquisitions early and proactively abort conflicting transactions, improving overall efficiency. To tolerate lock loss during CN failures, Lotus employs a lock-rebuild-free recovery mechanism that treats locks as ephemeral and avoids their reconstruction, ensuring lightweight recovery for CN failures. Experimental results demonstrate that Lotus improves transaction throughput by up to 2.1$\times$ and reduces latency by up to 49.4% compared to state-of-the-art transaction systems on DM.
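The lock-first idea can be illustrated with a small single-process sketch. It is not Lotus's implementation: the names `LockShard`, `Cluster`, `lock_phase`, and `commit` are hypothetical, keys are assumed to be `(partition_id, row_id)` pairs, and sharding by `partition_id` modulo the number of CNs is a simple stand-in for the application-aware sharding the abstract describes. The point it shows is that all locks live in CN-side tables, and a conflicting transaction is aborted in the locking phase before it touches any MN data.

```python
from dataclasses import dataclass, field


@dataclass
class LockShard:
    """Lock table held by one compute node (hypothetical structure)."""
    owners: dict = field(default_factory=dict)  # key -> owning txn id

    def try_lock(self, key, txn_id):
        # Succeeds if the key is free or already held by this transaction.
        return self.owners.setdefault(key, txn_id) == txn_id

    def unlock(self, key, txn_id):
        if self.owners.get(key) == txn_id:
            del self.owners[key]


class Cluster:
    def __init__(self, num_cns):
        self.shards = [LockShard() for _ in range(num_cns)]
        self.memory = {}       # stands in for data stored on memory nodes
        self.mn_accesses = 0   # counts accesses that would reach an MN

    def shard_for(self, key):
        # Assumption: locks are sharded by the key's partition id.
        partition_id, _ = key
        return self.shards[partition_id % len(self.shards)]

    def lock_phase(self, txn_id, keys):
        """Step 1: acquire every lock on CNs before touching MN data."""
        acquired = []
        for key in sorted(keys):
            if self.shard_for(key).try_lock(key, txn_id):
                acquired.append(key)
            else:
                # Conflict detected early: release and abort.
                self.release(txn_id, acquired)
                return False
        return True

    def commit(self, txn_id, writes):
        """Step 2: apply writes to MN data, then release the locks."""
        for key, value in writes.items():
            self.memory[key] = value
            self.mn_accesses += 1
        self.release(txn_id, writes)

    def release(self, txn_id, keys):
        for key in keys:
            self.shard_for(key).unlock(key, txn_id)


cluster = Cluster(num_cns=2)
a, b = (0, "alice"), (1, "bob")
t1_ok = cluster.lock_phase("t1", [a, b])     # t1 holds both locks
t2_ok = cluster.lock_phase("t2", [b])        # conflicts with t1, aborts early
cluster.commit("t1", {a: 90, b: 110})        # t1 writes and unlocks
t2_retry_ok = cluster.lock_phase("t2", [b])  # lock is free again
```

In this sketch the aborted attempt by `t2` never increments `mn_accesses`, which mirrors the abstract's claim that deciding lock success first lets conflicting transactions be aborted proactively. Because the lock tables are ordinary in-memory state on CNs, they could also be discarded after a failure rather than rebuilt, in the spirit of treating locks as ephemeral.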