DiFache: Efficient and Scalable Caching on Disaggregated Memory using Decentralized Coherence

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address high cache coherence overhead and poor scalability in disaggregated memory (DM) architectures, this paper proposes DiFache, a decentralized cache framework. Its core contributions are: (1) aligning the coherence model with memory nodes—not CPU caches—by leveraging the inherent serialization of application-level remote memory accesses; (2) designing a coordinator-free invalidation protocol enabling independent operation of remote compute nodes; and (3) introducing a fine-grained, adaptive object caching policy that dynamically adjusts cache granularity based on observed read-to-write ratios. DiFache tightly integrates RDMA primitives with DM semantics. Evaluated on 54 real-world Twitter trace workloads, it achieves an average speedup of 5.53× (up to 10.83×). When integrated into two production DM applications, it improves peak throughput by 7.94× and 2.19×, respectively.
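The adaptive object caching idea above can be illustrated with a small sketch. This is not the paper's implementation: the class name, the ratio threshold, and the two-way "cache whole object vs. bypass" decision are all assumptions made for illustration; DiFache's actual policy adjusts cache granularity per object on real DM hardware.

```python
# Hypothetical sketch of a read/write-ratio-driven caching decision.
# All names and the threshold value are illustrative assumptions,
# not taken from the DiFache paper.
class AdaptiveCachePolicy:
    def __init__(self, ratio_threshold=4.0):
        # Objects read at least this many times per write are cached.
        self.ratio_threshold = ratio_threshold
        self.reads = {}   # per-object read counters
        self.writes = {}  # per-object write counters

    def record(self, key, is_write):
        """Track one remote access to `key`."""
        counter = self.writes if is_write else self.reads
        counter[key] = counter.get(key, 0) + 1

    def granularity(self, key):
        """Return 'object' (cache the whole object) for read-heavy keys,
        'none' (bypass the CN-side cache) for write-heavy ones."""
        r = self.reads.get(key, 0)
        w = self.writes.get(key, 0)
        if w == 0:
            return "object"
        return "object" if r / w >= self.ratio_threshold else "none"
```

A write-heavy object would then bypass the cache (avoiding invalidation traffic to remote CNs), while a read-heavy one stays cached locally.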

📝 Abstract
The disaggregated memory (DM) architecture offers high resource elasticity at the cost of data access performance. While caching frequently accessed data in compute nodes (CNs) reduces access overhead, it requires costly centralized maintenance of cache coherence across CNs. This paper presents DiFache, an efficient, scalable, and coherent CN-side caching framework for DM applications. Observing that DM applications already serialize conflicting remote data access internally rather than relying on the cache layer, DiFache introduces decentralized coherence that aligns its consistency model with memory nodes instead of CPU caches, thereby eliminating the need for centralized management. DiFache features a decentralized invalidation mechanism to independently invalidate caches on remote CNs and a fine-grained adaptive scheme to cache objects with varying read-write ratios. Evaluations using 54 real-world traces from Twitter show that DiFache outperforms existing approaches by up to 10.83× (5.53× on average). By integrating DiFache, the peak throughput of two real-world DM applications increases by 7.94× and 2.19×, respectively.
Problem

Research questions and friction points this paper is trying to address.

High data-access overhead in disaggregated memory, which CN-side caching can mitigate
Costly centralized cache-coherence maintenance that limits scalability across CNs
One-size-fits-all caching that handles objects with varying read-write ratios poorly
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized coherence for DM caching
Decentralized invalidation on remote CNs
Fine-grained adaptive caching scheme
Hanze Zhang
Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University
Kaiming Wang
Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University
Rong Chen
Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University
Xingda Wei
Shanghai Jiao Tong University
Systems for AI · Distributed systems · Operating systems
Haibo Chen
Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University