An Early Exploration of Deep-Learning-Driven Prefetching for Far Memory

📅 2025-10-05

🤖 AI Summary

In far-memory systems, loading data on demand from far memory incurs substantial access latency and becomes a critical performance bottleneck. This paper proposes Memix, a far-memory system built on a co-design of deep learning and systems techniques for prefetching. Memix separates application semantics from runtime context and models each independently, training specialized deep neural networks to predict memory access patterns. It then combines these predictions with system-level cache management to make dynamic prefetching decisions, improving both prefetching accuracy and timeliness. Evaluated on representative data-intensive workloads, Memix achieves up to 42% end-to-end performance improvement over the state-of-the-art far-memory system, while reducing far-memory access latency by up to 37%.

📝 Abstract

Far-memory systems, where applications store less-active data in more energy-efficient memory media, are increasingly adopted by data centers. However, applications are bottlenecked by on-demand data fetching from far memory to local memory. We present Memix, a far-memory system that embodies a co-design of deep learning and systems techniques for efficient and accurate prefetching, minimizing on-demand far-memory accesses. One key observation is that memory accesses are shaped by both application semantics and runtime context, providing an opportunity to optimize each independently. Preliminary evaluation of Memix on data-intensive workloads shows that it outperforms the state-of-the-art far-memory system by up to 42%.
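To make the idea of handling application semantics and runtime context separately more concrete, here is a minimal Python sketch. It is not Memix's implementation: the abstract does not describe the model or the cache policy, so every name here (`DeltaPredictor`, `LocalCache`, `select_prefetches`, `run`) is hypothetical. A simple delta-history table stands in for the deep neural network, and an LRU page cache stands in for the runtime context.

```python
from collections import Counter, OrderedDict, defaultdict


class DeltaPredictor:
    """Hypothetical stand-in for a learned model of application access patterns.

    It only records which address delta tends to follow which; a real system
    would use a trained neural network here.
    """

    def __init__(self):
        self.table = defaultdict(Counter)  # previous delta -> next-delta counts
        self.last_page = None
        self.last_delta = None

    def observe(self, page):
        if self.last_page is not None:
            delta = page - self.last_page
            if self.last_delta is not None:
                self.table[self.last_delta][delta] += 1
            self.last_delta = delta
        self.last_page = page

    def predict(self, depth):
        """Chain the most likely deltas to propose up to `depth` future pages."""
        candidates, page, delta = [], self.last_page, self.last_delta
        for _ in range(depth):
            following = self.table.get(delta)
            if page is None or not following:
                break
            delta = following.most_common(1)[0][0]
            page += delta
            candidates.append(page)
        return candidates


class LocalCache:
    """Hypothetical LRU cache of pages resident in local memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def __contains__(self, page):
        return page in self.pages

    def insert(self, page):
        self.pages[page] = True
        self.pages.move_to_end(page)  # mark as most recently used
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used


def select_prefetches(candidates, cache, budget):
    """Runtime-context step: drop pages already local, cap the request count."""
    chosen = []
    for page in candidates:
        if len(chosen) >= budget:
            break
        if page not in cache and page not in chosen:
            chosen.append(page)
    return chosen


def run(trace, capacity=8, depth=4, budget=2):
    """Replay a page trace and count on-demand (miss-driven) far-memory fetches."""
    predictor, cache = DeltaPredictor(), LocalCache(capacity)
    demand_fetches = 0
    for page in trace:
        if page not in cache:
            demand_fetches += 1
        cache.insert(page)  # fetch on miss, or refresh recency on hit
        predictor.observe(page)
        for target in select_prefetches(predictor.predict(depth), cache, budget):
            cache.insert(target)
    return demand_fetches


if __name__ == "__main__":
    strided = list(range(0, 200, 2))
    print("with prefetching:   ", run(strided))
    print("without prefetching:", run(strided, budget=0))
```

The point of the split is that the predictor knows nothing about cache contents or fetch budgets, and the filter knows nothing about the access pattern, so either side could be replaced or tuned without touching the other.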

Problem

Research questions and friction points this paper is trying to address.

Addressing on-demand data fetching bottlenecks in far-memory systems

Optimizing prefetching using deep learning and system co-design

Reducing on-demand far-memory accesses by leveraging application semantics and runtime context

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep-learning-driven prefetching for far-memory systems

Co-design of deep learning and system for accurate prefetching

Modeling application semantics and runtime context separately so that each can be optimized independently
