Scaling atomic ordering in shared memory

📅 2025-09-09

🤖 AI Summary

Existing shared-memory systems lack efficient atomic multicast protocols, which hinders the development of fault-tolerant, scalable distributed services. To address this, the paper proposes TRAM, a high-performance, decentralized atomic multicast protocol designed natively for shared memory. TRAM combines an overlay tree topology with a lightweight synchronization mechanism, exploiting shared memory's low latency and high bandwidth to guarantee consistently ordered message delivery. In the experimental evaluation, TRAM achieves up to 3.1× higher throughput and 2.4× lower end-to-end latency than state-of-the-art shared-memory multicast approaches. Against the best-performing message-passing protocols, it delivers up to 5.9× higher throughput and reduces latency by up to 106×. TRAM thus combines high throughput, very low latency, and scalability in the shared-memory model.

📝 Abstract

Atomic multicast is a communication primitive used in dependable systems to ensure consistent ordering of messages delivered to a set of replica groups. This primitive enables critical services to integrate replication and sharding (i.e., state partitioning) to achieve fault tolerance and scalability. While several atomic multicast protocols have been developed for message-passing systems, only a few are designed for the shared memory system model. This paper introduces TRAM, an atomic multicast protocol specifically designed for shared memory systems, leveraging an overlay tree architecture. Due to its simple and practical design, TRAM delivers exceptional performance, increasing throughput by more than 3$\times$ and reducing latency by more than 2.3$\times$ compared to state-of-the-art shared memory-based protocols. Additionally, it significantly outperforms message-passing-based protocols, boosting throughput by up to 5.9$\times$ and reducing latency by up to 106$\times$.
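To make the ordering guarantee in the abstract concrete, here is a minimal sketch of a checker for one consequence of atomic multicast: any two replica groups that both deliver the same messages deliver them in the same relative order. This is not TRAM's algorithm, and the function name, group labels, and message identifiers are hypothetical, chosen only for illustration. The pairwise check is a necessary condition; the full property also rules out ordering cycles that span three or more groups.

```python
from itertools import combinations


def consistent_pairwise_order(deliveries):
    """Check pairwise order agreement across replica groups.

    `deliveries` maps a (hypothetical) group name to the list of message
    ids that group delivered, in delivery order. Each message is assumed
    to appear at most once per group.
    """
    for (_, seq1), (_, seq2) in combinations(deliveries.items(), 2):
        common = set(seq1) & set(seq2)
        # Project both delivery sequences onto the messages they share.
        order1 = [m for m in seq1 if m in common]
        order2 = [m for m in seq2 if m in common]
        if order1 != order2:
            return False
    return True


# m1 goes to g1 and g3, m2 to g1 and g2, m3 to all three groups.
good = {"g1": ["m1", "m2", "m3"], "g2": ["m2", "m3"], "g3": ["m1", "m3"]}
# g1 and g2 disagree on the relative order of m1 and m2.
bad = {"g1": ["m1", "m2"], "g2": ["m2", "m1"]}

print(consistent_pairwise_order(good))
print(consistent_pairwise_order(bad))
```

A checker of this kind is useful for testing a protocol's delivery logs, independently of how the protocol establishes the order internally.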

Problem

Research questions and friction points this paper is trying to address.

Ensuring consistent message ordering in shared memory systems

Improving atomic multicast protocol performance and scalability

Reducing latency and increasing throughput for replica groups

Innovation

Methods, ideas, or system contributions that make the work stand out.

TRAM protocol for shared memory systems

Overlay tree architecture for atomic multicast

High throughput and low latency performance
