A Tree-Structured Two-Phase Commit Framework for OceanBase: Optimizing Scalability and Consistency

📅 2026-02-28
🤖 AI Summary
This work addresses the trade-off between consistency and performance in distributed databases under cross-partition transactions, where traditional two-phase commit (2PC) protocols suffer from high coordination overhead, significant latency, and complex recovery during dynamic partition migration. To overcome these limitations, the authors propose a tree-based 2PC framework tailored for OceanBase, which treats log streams as atomic commit units and organizes participants into a coordinator-rooted directed acyclic commit tree. The protocol recursively executes a tree-structured commit process and introduces uncertainty states such as prepare-unknown and trans-unknown to prevent consistency violations caused by lost context. Notably, the design eliminates the need for explicit participant lists and inherently supports dynamic partition migration. Experimental results demonstrate that the framework substantially reduces latency and bandwidth overhead, achieving transaction performance close to single-node levels while maintaining strong consistency and high scalability.
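The recursive commit process summarized above can be illustrated with a minimal sketch. It is not the OceanBase implementation: the `Node` class, the `Vote` enum, and the rule that an uncertain vote leads to a retry rather than an abort are assumptions made for illustration, based only on the summary's description of a coordinator-rooted tree and a prepare-unknown state.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Vote(Enum):
    PREPARED = "prepared"
    ABORT = "abort"
    PREPARE_UNKNOWN = "prepare-unknown"  # participant has lost its context


@dataclass
class Node:
    """One participant (log stream) in a hypothetical commit tree."""
    name: str
    local_vote: Vote = Vote.PREPARED
    children: List["Node"] = field(default_factory=list)


def prepare(node: Node) -> Vote:
    """Recursively combine the node's own vote with its subtree's votes."""
    votes = [node.local_vote] + [prepare(child) for child in node.children]
    if Vote.ABORT in votes:
        return Vote.ABORT            # any definite abort wins
    if Vote.PREPARE_UNKNOWN in votes:
        return Vote.PREPARE_UNKNOWN  # uncertainty is passed up, not turned into abort
    return Vote.PREPARED


def decide(root: Node) -> str:
    """Coordinator decision at the root (assumed policy: retry on uncertainty)."""
    vote = prepare(root)
    if vote is Vote.PREPARED:
        return "commit"
    if vote is Vote.ABORT:
        return "abort"
    return "retry"


# A partition moved from ls2 to ls3 mid-transaction, so ls3 hangs below ls2
# as a leaf; the coordinator never needs an updated participant list.
migrated = Node("ls3", Vote.PREPARE_UNKNOWN)
root = Node("coordinator", children=[Node("ls1"), Node("ls2", children=[migrated])])
outcome = decide(root)
```

In this sketch the coordinator only talks to its direct children, and each child answers for its own subtree, which is why a migration target can be added as a leaf without the root knowing about it.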

📝 Abstract
Modern distributed databases face challenges in achieving transactional consistency across distributed partitions. Traditional two-phase commit (2PC) protocols incur high coordination overhead and latency, and require complex recovery for dynamic partition transfers. This paper introduces a novel tree-shaped 2PC framework for OceanBase that leverages single-machine log streams to address these challenges through three innovations. First, we propose log streams as atomic participants, replacing partition-level coordination. By treating each log stream as the commit unit, a transaction spanning $N$ co-located partitions interacts with one participant, reducing coordination overhead by orders of magnitude (e.g., 99 percent reduction for $N=100$). Second, we design a tree-shaped 2PC protocol with coordinator-rooted DAG topology that dynamically handles partition transfers by recursively constructing commit trees. When a partition migrates during a transaction, the protocol embeds migration contexts as leaf nodes, eliminating explicit participant list updates, resolving circular dependencies, and ensuring linearizable commits under topology changes. Third, we introduce prepare-unknown and trans-unknown states to prevent consistency violations when participants lose context. These states signal uncertainty during retries, avoiding erroneous aborts from so-called lying participants while isolating users from ambiguity. Experimental evaluation demonstrates performance approaching that of single-machine transactions, with reduced latency and bandwidth consumption, validating the framework's effectiveness for modern distributed databases.
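The participant-count arithmetic behind the first innovation can be checked with a short sketch. The partition-to-log-stream mapping and both function names are hypothetical; the abstract states only that $N$ co-located partitions collapse into one participant.

```python
from collections import defaultdict


def participants_by_log_stream(partition_to_stream):
    """Group partition ids by the log stream hosting them (hypothetical mapping)."""
    groups = defaultdict(list)
    for partition, stream in partition_to_stream.items():
        groups[stream].append(partition)
    return dict(groups)


def participant_reduction(num_partitions, num_streams):
    """Fractional drop in participant count versus partition-level 2PC."""
    return 1 - num_streams / num_partitions


# N = 100 partitions, all co-located on a single log stream.
mapping = {f"p{i}": "ls1" for i in range(100)}
groups = participants_by_log_stream(mapping)
reduction = participant_reduction(len(mapping), len(groups))
```

With 100 partitions on one log stream, the transaction has a single participant instead of 100, which is the 99 percent reduction quoted in the abstract. The figure counts participants only; it says nothing about message sizes or latency.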
Problem

Research questions and friction points this paper is trying to address.

distributed databases
transactional consistency
two-phase commit
partition migration
coordination overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

log stream
tree-structured 2PC
partition migration
linearizable commit
consistency states
Quanqing Xu
Ant Group
Cloud Computing, Cloud Storage, Large-scale Hybrid Storage Systems
Chen Qian
OceanBase, Ant Group
Chuanhui Yang
OceanBase, Ant Group
Fanyu Kong
OceanBase, Ant Group
Guixiang Liu
OceanBase, Ant Group
Fusheng Han
OceanBase, Ant Group
Zixiang Zhai
OceanBase, Ant Group