DecodeX: Exploring and Benchmarking of LDPC Decoding across CPU, GPU, and ASIC Platforms

📅 2025-11-04
🤖 AI Summary
Problem: Virtualized RAN (vRAN) lacks a unified benchmark for LDPC decoding acceleration across heterogeneous platforms (CPU, GPU, ASIC), which hinders fair performance comparison and co-design. Method: The paper proposes DecodeX, a benchmarking framework for baseband processing. It integrates representative decoder implementations, namely FlexRAN (CPU), Aerial and Sionna-RK (GPU), and ACC100 (ASIC), together with their kernels, APIs, and test vectors, so that the effect of threading, memory management, data movement, and accelerator offload on decoding latency can be quantified under varying physical-layer parameters. Contribution/Results: DecodeX reveals distinct trade-offs between parallel efficiency and offload overhead, showing that accelerator gains depend strongly on data movement and workload granularity. The authors discuss how cross-platform benchmarking can inform adaptive scheduling and co-design for heterogeneous vRANs, with the goal of scalable and energy-efficient baseband processing.

📝 Abstract
Emerging virtualized radio access networks (vRANs) demand flexible and efficient baseband processing across heterogeneous compute substrates. In this paper, we present DecodeX, a unified benchmarking framework for evaluating low-density parity-check (LDPC) decoding acceleration across different hardware platforms. DecodeX integrates a comprehensive suite of LDPC decoder implementations, including kernels, APIs, and test vectors for CPUs (FlexRAN), GPUs (Aerial and Sionna-RK), and ASICs (ACC100), and can be readily extended to additional architectures and configurations. Using DecodeX, we systematically characterize how different platforms orchestrate computation, from threading and memory management to data movement and accelerator offload, and quantify the resulting decoding latency under varying physical-layer parameters. Our observations reveal distinct trade-offs between parallel efficiency and offload overhead, showing that accelerator gains depend strongly on data movement and workload granularity. Building on these insights, we discuss how cross-platform benchmarking can inform adaptive scheduling and co-design for future heterogeneous vRANs, enabling scalable and energy-efficient baseband processing for NextG wireless systems.
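The trade-off between parallel efficiency and offload overhead can be illustrated with a toy latency model. This is a minimal sketch, not the paper's methodology: the function names and every constant (per-block compute time, fixed offload cost, transfer cost, batch size, core count) are assumptions chosen only to show why an accelerator can lose on small workloads and win on large ones.

```python
# Toy model only: every constant below is an illustrative assumption,
# not a measurement from DecodeX or any real platform.

def cpu_latency_us(num_codeblocks, per_block_us=20.0, cores=4):
    """CPU model: code blocks are split evenly across cores, no transfer cost."""
    blocks_per_core = -(-num_codeblocks // cores)  # ceiling division
    return blocks_per_core * per_block_us


def accel_latency_us(num_codeblocks, fixed_offload_us=50.0,
                     transfer_us_per_block=1.0, per_batch_us=10.0, batch=64):
    """Accelerator model: fixed offload cost + per-block transfer + batched compute."""
    batches = -(-num_codeblocks // batch)  # ceiling division
    return (fixed_offload_us
            + transfer_us_per_block * num_codeblocks
            + batches * per_batch_us)


def break_even_blocks(max_blocks=1024):
    """Smallest workload size at which the accelerator model beats the CPU model."""
    for n in range(1, max_blocks + 1):
        if accel_latency_us(n) < cpu_latency_us(n):
            return n
    return None


if __name__ == "__main__":
    for n in (1, 8, 64, 256):
        print(n, cpu_latency_us(n), accel_latency_us(n))
    print("break-even workload (code blocks):", break_even_blocks())
```

Under these assumed constants the fixed offload cost dominates for a handful of code blocks, while batching amortizes it for larger workloads. Changing the constants moves the break-even point, which is the kind of dependence on workload granularity and data movement the abstract describes.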
Problem

Research questions and friction points this paper is trying to address.

Benchmarking LDPC decoding across CPU, GPU, and ASIC platforms
Characterizing computation orchestration and latency trade-offs
Enabling adaptive scheduling for heterogeneous vRAN baseband processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified benchmarking framework for LDPC decoding
Integrates decoder implementations across CPU, GPU, and ASIC platforms
Characterizes computation orchestration and decoding latency
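To make the idea of benchmarking several decoder backends behind one interface concrete, the following is a minimal sketch of a timing harness. It does not reproduce the DecodeX API: `Decoder`, `hard_decision`, and `benchmark` are hypothetical names, and the placeholder backend is a sign-based hard decision (assuming positive LLR means bit 0), not an LDPC decoder.

```python
import time
from statistics import median
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: a backend is any callable mapping LLRs to hard bits.
Decoder = Callable[[Sequence[float]], List[int]]


def hard_decision(llrs: Sequence[float]) -> List[int]:
    """Placeholder backend (NOT an LDPC decoder): positive LLR -> 0, else 1."""
    return [0 if x >= 0 else 1 for x in llrs]


def benchmark(backends: Dict[str, Decoder],
              vectors: Sequence[Sequence[float]],
              repeats: int = 5) -> Dict[str, float]:
    """Return the median wall-clock latency (microseconds) per backend."""
    results = {}
    for name, decode in backends.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            for llrs in vectors:
                decode(llrs)  # same test vectors for every backend
            samples.append((time.perf_counter() - start) * 1e6)
        results[name] = median(samples)
    return results


if __name__ == "__main__":
    test_vectors = [[0.5, -1.0, 2.0]] * 100
    print(benchmark({"placeholder": hard_decision}, test_vectors))
```

A real backend would wrap a platform-specific decoder behind the same callable signature, so that the identical test vectors and timing loop apply to each platform. Wall-clock timing from the host side also captures data-movement and offload costs, not only kernel time.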
Zhenzhou Qi
Duke University
vRAN, Heterogeneous Computing, Wireless Network System, Computer Network System
Yuncheng Yao
Department of Computer Science, Duke University
Yiming Li
Department of Electrical and Computer Engineering, Duke University
Chung-Hsuan Tung
Department of Electrical and Computer Engineering, Duke University
Junyao Zheng
Department of Electrical and Computer Engineering, Duke University
Danyang Zhuo
Duke University
Distributed Systems, Networking, Operating Systems
Tingjun Chen
Nortel Networks Assistant Professor of Electrical and Computer Engineering, Duke University
Wireless Networks, Optical Networks, Mobile Computing, IoT, Testbeds