AI Summary
This work addresses the performance bottleneck in CPU-based OLTP systems caused by frequent DRAM accesses due to cold locks, which severely degrade lock service efficiency. To overcome this limitation, the authors propose an FPGA-based hardware acceleration architecture that eliminates reliance on main memory for lock operations by integrating an on-chip lock table, a customized low-latency lock agent, and a full-lifecycle transaction execution engine. This design enables highly efficient lock acquisition and release while supporting a scalable transaction processing pipeline. Evaluation using the TPC-C benchmark demonstrates that the proposed system achieves up to a 51× throughput improvement over a pure CPU baseline, strongly validating the effectiveness of hardware-software co-optimization for OLTP workloads.
Abstract
Online Transaction Processing (OLTP) is a classic workload of growing commercial importance. CPU-based OLTP serves locks inefficiently. The main reason is that most locks are cold, so the lock agent must issue frequent memory accesses to retrieve a lock's details before it can decide whether to grant the lock. This motivates us to propose dedicated hardware-based lock agents with integrated lock tables that remove the DRAM access overhead.
In this paper, we propose hardware-accelerated lock management and transaction processing for database systems. First, we propose a low-latency lock agent optimized for both lock acquire and lock release requests. Second, we design a scalable transaction agent that executes the full transaction lifecycle. We present the architecture, optimizations, and design-space exploration of the proposed lock management and transaction processing system. Experimental results show up to 51× higher transaction throughput than the CPU baseline on the TPC-C benchmark.
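To make the lock agent's job concrete, the following is a minimal software sketch of a lock table that decides whether to grant an acquire request and which waiters to wake on release. It is an illustration only and not the paper's hardware design: the names `LockTable`, `LockEntry`, `acquire`, and `release`, the two lock modes, and the FIFO waiter policy are all assumptions made for this example. In a CPU implementation, each lookup of an entry like this is the kind of memory access that becomes costly when the lock is cold.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

SHARED, EXCLUSIVE = "S", "X"  # hypothetical lock modes for this sketch


@dataclass
class LockEntry:
    mode: Optional[str] = None                     # mode currently held, None if free
    holders: set = field(default_factory=set)      # transaction ids holding the lock
    waiters: deque = field(default_factory=deque)  # FIFO queue of (txn_id, mode)


class LockTable:
    """Illustrative lock table; assumes a transaction never re-requests a lock it holds."""

    def __init__(self):
        self.entries = {}

    def acquire(self, key, txn_id, mode):
        """Return True if the lock is granted now, False if the request is queued."""
        entry = self.entries.setdefault(key, LockEntry())
        if not entry.holders:
            entry.mode = mode
            entry.holders.add(txn_id)
            return True
        # Shared requests can join shared holders, but not ahead of queued waiters.
        if entry.mode == SHARED and mode == SHARED and not entry.waiters:
            entry.holders.add(txn_id)
            return True
        entry.waiters.append((txn_id, mode))
        return False

    def release(self, key, txn_id):
        """Release the lock and return the transaction ids granted as a result."""
        entry = self.entries[key]
        entry.holders.discard(txn_id)
        granted = []
        if entry.holders:
            return granted
        entry.mode = None
        while entry.waiters:
            next_txn, next_mode = entry.waiters[0]
            # Stop once a grant would conflict with what has just been granted.
            if entry.holders and (entry.mode == EXCLUSIVE or next_mode == EXCLUSIVE):
                break
            entry.waiters.popleft()
            entry.mode = next_mode
            entry.holders.add(next_txn)
            granted.append(next_txn)
        if not entry.holders:
            del self.entries[key]  # drop idle entries to keep the table small
        return granted
```

Under these assumptions, two shared requests on the same key are both granted immediately, a later exclusive request is queued, and it is granted only when the last shared holder releases.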