🤖 AI Summary
In remote memory systems, loading data on demand from remote nodes incurs substantial access latency and is a critical performance bottleneck. This paper proposes Memix, a prefetching framework built on a deep-learning-system co-design. Memix decouples application semantics from runtime context and models each independently, training specialized deep neural networks to predict memory access patterns accurately. It integrates these predictions with system-level cache management to make dynamic prefetching decisions through hardware-software co-optimization. This design improves both prefetching accuracy and timeliness. Evaluated on representative data-intensive workloads, Memix achieves up to 42% end-to-end performance improvement over state-of-the-art remote memory systems and reduces remote memory access latency by up to 37%.
📝 Abstract
Far-memory systems, where applications store less-active data in more energy-efficient memory media, are increasingly adopted by data centers. However, applications are bottlenecked by on-demand data fetching from far memory to local memory. We present Memix, a far-memory system that embodies a deep-learning-system co-design for efficient and accurate prefetching, minimizing on-demand far-memory accesses. One key observation is that memory accesses are shaped by both application semantics and runtime context, which provides an opportunity to optimize each independently. Preliminary evaluation of Memix on data-intensive workloads shows that it outperforms the state-of-the-art far-memory system by up to 42%.