A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

📅 2024-02-28
🏛️ Asia and South Pacific Design Automation Conference
📈 Citations: 1
Influential: 0
🤖 AI Summary
Wireless baseband processing (WBP) is periodic, pipelined, and throughput-hungry. Conventional DSP and GPU architectures offer parallelism but do not account for this cyclical, consecutive character, and symmetric multiprocessors struggle with the large data volume because their memory latency is unpredictable. The paper proposes a hierarchical dataflow-driven heterogeneous architecture with three parts: (1) a multi-level dataflow model for WBP; (2) a "pack-and-ship" approach under a NUMA architecture that lets subordinate tiles work in a bundled access-and-execute manner; and (3) a scheduling scheme that manages and allocates the heterogeneous hardware resources. On several critical WBP benchmarks, the prototype achieves 2× and 2.3× speedup in normalized throughput and single-tile clock cycles compared with GPU and DSP counterparts, and a 45-core configuration reaches a link-level throughput of 288 Mbps.

📝 Abstract
Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and, more recently, graphics processing units (GPUs), provide various degrees of parallelism, yet both fail to take into account the cyclical and consecutive character of WBP. Furthermore, the large amount of data in WBP cannot be processed quickly in symmetric multiprocessors (SMPs) due to the unpredictability of memory latency. To address this issue, we propose a hierarchical dataflow-driven architecture to accelerate WBP. A pack-and-ship approach is presented under a non-uniform memory access (NUMA) architecture to allow the subordinate tiles to operate in a bundled access-and-execute manner. We also propose a multi-level dataflow model and the related scheduling scheme to manage and allocate the heterogeneous hardware resources. Experimental results demonstrate that our prototype achieves $2\times$ and $2.3\times$ speedup in terms of normalized throughput and single-tile clock cycles compared with GPU and DSP counterparts in several critical WBP benchmarks. Additionally, a link-level throughput of $288$ Mbps can be achieved with a $45$-core configuration.
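
The pack-and-ship idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the `Packet` and `Tile` classes, the `pack_and_ship` helper, and the kernel names are all hypothetical. It shows operands being gathered once, bundled with their task, and delivered to a tile's local memory so that execution never reaches back into shared memory.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Hypothetical bundle: a task name plus every operand it needs."""
    task: str
    operands: list


@dataclass
class Tile:
    """Hypothetical subordinate tile with its own local memory."""
    name: str
    local_mem: list = field(default_factory=list)

    def receive(self, packet):
        # The whole bundle lands in local memory in one transfer.
        self.local_mem.append(packet)

    def run(self, kernels):
        # Execution reads only local memory, never the shared pool.
        results = []
        while self.local_mem:
            pkt = self.local_mem.pop(0)
            results.append(kernels[pkt.task](pkt.operands))
        return results


def pack_and_ship(tile, task, shared_mem, indices):
    """Gather operands from shared memory once, then ship them as one packet."""
    operands = [shared_mem[i] for i in indices]
    tile.receive(Packet(task, operands))


# Assumed toy workload: two tasks shipped to one tile.
shared = [1.0, 2.0, 3.0, 4.0]
tile = Tile("tile0")
pack_and_ship(tile, "sum", shared, [0, 1, 2, 3])
pack_and_ship(tile, "scale", shared, [2, 3])

kernels = {"sum": sum, "scale": lambda xs: [2 * x for x in xs]}
results = tile.run(kernels)
```

In this toy model the only access to `shared` happens inside `pack_and_ship`, which is the property the bundled access-and-execute description points at.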
Problem

Research questions and friction points this paper is trying to address.

Accelerate wireless baseband processing with hierarchical dataflow architecture
Address memory latency unpredictability in symmetric multiprocessors
Optimize heterogeneous resource allocation via multi-level dataflow model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical dataflow-driven heterogeneous architecture for WBP
Pack-and-ship approach under NUMA architecture
Multi-level dataflow model with scheduling scheme
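
As a rough illustration of how a multi-level dataflow model might drive scheduling on heterogeneous units, the sketch below uses two levels: a module-level graph and a per-module list of tile-level tasks. The module names, task names, and the `scalar`/`vector` unit types are assumptions made for the example; the summary does not specify them, and the paper's actual model and scheduler may differ.

```python
from collections import deque

# Level 1 (assumed): module-level graph, node -> list of predecessor modules.
module_graph = {
    "fft": [],
    "chan_est": ["fft"],
    "equalize": ["fft", "chan_est"],
    "demod": ["equalize"],
}

# Level 2 (assumed): each module expands into tile-level tasks, each tagged
# with the kind of hardware unit it needs.
tile_tasks = {
    "fft": [("butterfly", "vector")],
    "chan_est": [("pilot_extract", "scalar"), ("interpolate", "vector")],
    "equalize": [("mmse", "vector")],
    "demod": [("llr", "scalar")],
}


def firing_order(graph):
    """Fire a node once every predecessor has produced its token."""
    pending = {node: len(preds) for node, preds in graph.items()}
    ready = deque(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ, preds in graph.items():
            if node in preds:
                pending[succ] -= 1
                if pending[succ] == 0:
                    ready.append(succ)
    return order


def schedule(graph, tasks):
    """Walk modules in firing order and route each tile task to a unit queue."""
    queues = {"scalar": [], "vector": []}
    for module in firing_order(graph):
        for task, unit in tasks[module]:
            queues[unit].append(f"{module}.{task}")
    return queues


order = firing_order(module_graph)
queues = schedule(module_graph, tile_tasks)
```

The upper level decides when a module may fire; the lower level decides which kind of unit each of its tasks is queued on.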
Authors
Limin Jiang
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Yi-xing Shi
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Haiqin Hu
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Qingyu Deng
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Siyi Xu
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Yintao Liu
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Feng Yuan
Postdoctoral Fellow of Computer Science and Engineering, The Chinese University of Hong Kong
Computer Aided Design, Fault-Tolerant Computing
Si Wang
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Yihao Shen
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Fangfang Ye
School of Communication and Information Engineering, Shanghai University, Shanghai, China
Shan Cao
Shanghai University
ASIC, Wireless communication systems, machine learning acceleration
Zhiyuan Jiang
School of Communication and Information Engineering, Shanghai University, Shanghai, China