🤖 AI Summary
Cloud-native databases have separated storage from compute, yet their control planes still rely on external coordination services (e.g., ZooKeeper), which are effectively lightweight databases for low-volume metadata. As the control plane scales, this reliance causes scalability bottlenecks, low cost efficiency, and added operational burden. This paper applies the storage-disaggregation paradigm to the cluster coordination layer and presents Marlin, a coordination mechanism embedded in the cloud-native database it manages, removing the dependence on external coordinators. Marlin permits cross-node modifications of coordination state to support failover, manages both coordination and application state with transactions, and introduces MarlinCommit, an optimized commit protocol that preserves strong transactional guarantees under such modifications. Experimental evaluation shows up to 4.4× better cost efficiency and up to 4.9× shorter reconfiguration duration than converged coordination solutions.
📝 Abstract
Modern cloud databases are shifting from converged architectures to storage disaggregation, enabling independent scaling and billing of compute and storage. However, they still rely on external, converged coordination services (e.g., ZooKeeper) for their control planes. These services are effectively lightweight databases optimized for low-volume metadata. As the control plane scales in the cloud, this approach faces the same limitations that converged databases faced before storage disaggregation: scalability bottlenecks, low cost efficiency, and increased operational burden.
We propose disaggregating cluster coordination to achieve the same benefits that storage disaggregation brought to modern cloud DBMSs. We present Marlin, a cloud-native coordination mechanism that fully embraces storage disaggregation. Marlin eliminates the need for external coordination services by consolidating coordination functionality into the existing cloud-native database it manages. To achieve failover without an external coordination service, Marlin allows cross-node modifications of coordination state. To ensure data consistency, Marlin uses transactions to manage both coordination and application state and introduces MarlinCommit, an optimized commit protocol that ensures strong transactional guarantees even under cross-node modifications. Our evaluation demonstrates that Marlin improves cost efficiency by up to 4.4× and reduces reconfiguration duration by up to 4.9× compared to converged coordination solutions.