🤖 AI Summary
Problem: Virtualized RAN (vRAN) lacks a unified benchmark for evaluating LDPC decoding acceleration across heterogeneous platforms (CPU, GPU, ASIC), which hinders fair performance comparison and co-design.
Method: This paper proposes DecodeX, a standardized LDPC decoding benchmark framework tailored for baseband processing. It integrates representative decoding kernels—FlexRAN (CPU), Aerial and Sionna-RK (GPU), and ACC100 (ASIC)—under a unified API and comprehensive multi-scenario test vectors, enabling systematic quantification of computational scheduling, memory access patterns, and data movement overheads on decoding latency.
Contribution/Results: DecodeX exposes distinct trade-offs between parallel efficiency and offload overhead, showing that accelerator gains depend strongly on workload granularity and data-movement cost. These observations provide an empirical basis for heterogeneous vRAN co-design and adaptive runtime scheduling, with the goal of scalable and energy-efficient baseband processing.
📝 Abstract
Emerging virtualized radio access networks (vRANs) demand flexible and efficient baseband processing across heterogeneous compute substrates. In this paper, we present DecodeX, a unified benchmarking framework for evaluating low-density parity-check (LDPC) decoding acceleration across different hardware platforms. DecodeX integrates a comprehensive suite of LDPC decoder implementations, including kernels, APIs, and test vectors for CPUs (FlexRAN), GPUs (Aerial and Sionna-RK), and ASICs (ACC100), and can be readily extended to additional architectures and configurations. Using DecodeX, we systematically characterize how different platforms orchestrate computation, from threading and memory management to data movement and accelerator offload, and quantify the resulting decoding latency under varying physical-layer parameters. Our observations reveal distinct trade-offs between parallel efficiency and offload overhead, showing that accelerator gains depend strongly on data movement and workload granularity. Building on these insights, we discuss how cross-platform benchmarking can inform adaptive scheduling and co-design for future heterogeneous vRANs, enabling scalable and energy-efficient baseband processing for NextG wireless systems.
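To make the dependence on workload granularity concrete, the following is a minimal toy latency model, not DecodeX's API or its measured data. It assumes a platform's decode latency is a fixed per-call offload cost, plus a per-code-block transfer cost, plus compute time for each "wave" of code blocks decoded in parallel. The `Platform` class, its fields, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical cost parameters; none of these values are measurements."""
    name: str
    fixed_overhead_us: float    # per-call launch/offload cost
    transfer_us_per_cb: float   # data movement cost per code block
    compute_us_per_cb: float    # time to decode one code block on one lane
    lanes: int                  # code blocks decoded concurrently


def batch_latency_us(p: Platform, num_code_blocks: int) -> float:
    """Latency of decoding a batch: overhead + transfers + parallel waves."""
    waves = -(-num_code_blocks // p.lanes)  # ceiling division
    return (p.fixed_overhead_us
            + p.transfer_us_per_cb * num_code_blocks
            + p.compute_us_per_cb * waves)


def faster_platform(a: Platform, b: Platform, num_code_blocks: int) -> str:
    """Name of the platform with lower modeled latency (ties go to `a`)."""
    la = batch_latency_us(a, num_code_blocks)
    lb = batch_latency_us(b, num_code_blocks)
    return a.name if la <= lb else b.name


# Illustrative, made-up parameters: a single CPU core with no offload cost,
# and an accelerator with 16 parallel lanes but a fixed offload penalty.
cpu = Platform("cpu", 0.0, 0.0, 20.0, 1)
accel = Platform("accelerator", 50.0, 1.0, 30.0, 16)

if __name__ == "__main__":
    for n in (1, 4, 5, 16, 64):
        print(n, batch_latency_us(cpu, n), batch_latency_us(accel, n),
              faster_platform(cpu, accel, n))
```

With these made-up parameters, the CPU has lower modeled latency for batches of up to 4 code blocks, and the accelerator wins from 5 code blocks onward, because its fixed offload cost is amortized over more parallel work. A real benchmark would replace the constants with measured values per platform and physical-layer configuration.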