Performance Characterizations and Usage Guidelines of Samsung CXL Memory Module Hybrid Prototype

📅 2025-03-27
🤖 AI Summary
To address the high cost, limited capacity, and volatility of DRAM, this work presents the first system-level performance characterization and application adaptation study of Samsung’s CXL memory module hybrid prototype (CMM-H)—integrating DRAM and NAND flash—on an FPGA platform, targeting data-intensive workloads such as AI/ML and HPC. We propose a memory-semantics-based deployment scheme for NAND-backed memory, bypassing traditional block-device I/O stack overheads, and develop a multi-tiered microbenchmark and real-application evaluation framework. Experimental results show that CMM-H achieves over 90% of DRAM performance across mainstream AI, HPC, and database workloads, with average memory access latency under 1.5 μs. We further delineate its applicability boundaries and identify key tuning strategies for optimal deployment. This study provides the first comprehensive empirical validation and practical engineering guidance for deploying CXL-based hybrid memory systems.

📝 Abstract
The growing prevalence of data-intensive workloads, such as artificial intelligence (AI), machine learning (ML), high-performance computing (HPC), in-memory databases, and real-time analytics, has exposed the limitations of conventional memory technologies like DRAM. While DRAM offers low latency and high throughput, it is constrained by high cost, limited scalability, and volatility, making it less viable for capacity-bound and persistent applications in modern datacenters. Recently, Compute Express Link (CXL) has emerged as a promising alternative, enabling high-speed, cacheline-granular communication between CPUs and external devices. By leveraging CXL, NAND flash can now be used as memory expansion, offering three benefits: byte-addressability, scalable capacity, and persistence at low cost. Samsung's CXL Memory Module Hybrid (CMM-H) is the first product to deliver these benefits through a hardware-only solution; that is, unlike conventional block devices, it incurs no OS or I/O overheads. In particular, CMM-H integrates a DRAM cache with NAND flash in a single device to deliver near-DRAM latency. This paper presents the first publicly available comprehensive characterization of an FPGA-based CMM-H prototype. Through this study, we address users' concerns about whether a wide variety of applications can run successfully on a memory device backed by a NAND flash medium. Based on these characterizations, we also provide key insights into how to take best advantage of the CMM-H device.
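The claim that a DRAM cache in front of NAND flash can deliver near-DRAM latency depends on how often accesses hit the cache. The sketch below is a simple expected-value model of that trade-off, not the paper's methodology. The function name and both latency constants are hypothetical and chosen only for illustration; they are not measurements of CMM-H.

```python
def average_access_latency_ns(hit_rate, dram_hit_ns, nand_miss_ns):
    """Expected latency of one access to a DRAM-cached, NAND-backed device.

    hit_rate is the fraction of accesses served from the DRAM cache;
    the remaining accesses pay the (much larger) NAND miss latency.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    return hit_rate * dram_hit_ns + (1.0 - hit_rate) * nand_miss_ns


# Hypothetical latencies for illustration only, not measured values.
DRAM_HIT_NS = 500.0
NAND_MISS_NS = 50_000.0

for hit_rate in (1.0, 0.99, 0.9):
    latency = average_access_latency_ns(hit_rate, DRAM_HIT_NS, NAND_MISS_NS)
    print(f"hit rate {hit_rate:.2f}: {latency:.0f} ns average")
```

Under these assumed numbers, even a small miss rate dominates the average, which is why workloads whose working set fits in the DRAM cache are the natural candidates for such a device.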
Problem

Research questions and friction points this paper is trying to address.

Evaluating Samsung CXL Memory Module Hybrid for data-intensive workloads
Assessing NAND flash as scalable, persistent memory alternative
Providing usage guidelines for optimal CMM-H performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

CXL technology enables high-speed CPU-device communication
Hybrid DRAM-NAND flash for near-DRAM latency
Hardware-only solution avoids OS and IO overheads
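To make the contrast between block-device access and memory-semantic access concrete, here is a minimal sketch that reads the same byte two ways. It uses an ordinary temporary file as a stand-in for the device, so it does not reproduce CMM-H's hardware path (a mapped regular file still goes through the OS page cache). The helper names and the 4096-byte block size are assumptions made for illustration, and `os.pread` is available on Unix-like systems only.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for the block-style path


def read_byte_block_style(fd, offset):
    """Fetch the whole enclosing block via a system call, then pick one byte."""
    block_start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    block = os.pread(fd, BLOCK_SIZE, block_start)
    return block[offset - block_start]


def read_byte_memory_style(mapping, offset):
    """Index the mapped region directly, the way a load instruction would."""
    return mapping[offset]


def demo():
    # A 16 KiB temporary file stands in for the storage medium.
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 64)
        f.flush()
        fd = f.fileno()
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapping:
            offset = 5000
            return (
                read_byte_block_style(fd, offset),
                read_byte_memory_style(mapping, offset),
            )


if __name__ == "__main__":
    print(demo())
```

Both paths return the same byte. The difference is that the block-style path moves a full block through a system call to obtain one byte, while the memory-style path addresses the byte directly.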
Jianping Zeng
Assistant Professor of Computer Science and Engineering at Arizona State University
Computer Architecture, Compilers
Shuyi Pei
Samsung Semiconductor, Inc.
Da Zhang
Samsung Semiconductor, USA
Yuchen Zhou
Purdue University, USA
Amir Beygi
Samsung Semiconductor, USA
Xuebin Yao
Samsung Semiconductor, USA
Ramdas Kachare
Samsung Semiconductor, USA
Tong Zhang
Samsung Semiconductor, USA
Zongwang Li
Samsung Semiconductor, USA
Marie Nguyen
Samsung Semiconductor, USA
Rekha Pitchumani
Samsung Semiconductor, USA
Yang Soek Ki
Samsung Semiconductor, USA
Changhee Jung
Samuel D. Conte Associate Professor of Computer Science, Purdue University
Compilers, Computer Architecture, Runtime Systems