A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

📅 2024-02-28

🏛️ Asia and South Pacific Design Automation Conference

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Wireless baseband processing (WBP) is periodic, pipeline-parallel, and throughput-critical. Conventional DSP and GPU architectures do not exploit these dataflow characteristics, and symmetric multiprocessors suffer from unpredictable memory latency. To address this, we propose a hierarchical dataflow-driven heterogeneous architecture with two main components: (1) a pack-and-ship approach under a NUMA architecture that lets subordinate tiles access data in bundles and then execute; and (2) a multi-level dataflow model with a matching scheduling scheme that manages and allocates the heterogeneous hardware resources. On several critical WBP benchmarks, the prototype achieves 2× and 2.3× speedups in normalized throughput and single-tile clock cycles compared with GPU and DSP counterparts, and a 45-core configuration reaches a link-level throughput of 288 Mbps.

📝 Abstract

Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphic processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and consecutive character of WBP. Furthermore, the large amount of data in WBPs cannot be processed quickly in symmetric multiprocessors (SMPs) due to the unpredictability of memory latency. To address this issue, we propose a hierarchical dataflow-driven architecture to accelerate WBP. A pack-and-ship approach is presented under a non-uniform memory access (NUMA) architecture to allow the subordinate tiles to operate in a bundled access and execute manner. We also propose a multi-level dataflow model and the related scheduling scheme to manage and allocate the heterogeneous hardware resources. Experiment results demonstrate that our prototype achieves $2\times$ and $2.3\times$ speedup in terms of normalized throughput and single-tile clock cycles compared with GPU and DSP counterparts in several critical WBP benchmarks. Additionally, a link-level throughput of $288$ Mbps can be achieved with a $45$-core configuration.

Problem

Research questions and friction points this paper is trying to address.

Accelerate wireless baseband processing with hierarchical dataflow architecture

Address memory latency unpredictability in symmetric multiprocessors

Optimize heterogeneous resource allocation via multi-level dataflow model

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical dataflow-driven heterogeneous architecture for WBP

Pack-and-ship approach under NUMA architecture

Multi-level dataflow model with scheduling scheme
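
The pack-and-ship idea can be pictured with a small toy model. The sketch below is only an illustration of the "bundled access and execute" pattern described in the abstract, not the paper's implementation: the `Bundle`, `Tile`, and `pack_and_ship` names, the two example tasks, and the transfer counter are all hypothetical. It assumes that a controller gathers every operand a task needs from shared memory, ships them to a tile in one transfer, and that the tile then computes using only its local memory.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """Everything one task needs, packed into a single transfer (hypothetical)."""
    task: str
    operands: list[float]


@dataclass
class Tile:
    """A subordinate tile with its own local memory (hypothetical model)."""
    name: str
    local_memory: list[Bundle] = field(default_factory=list)
    transfers_in: int = 0  # number of bundled transfers received

    def receive(self, bundle: Bundle) -> None:
        # One bundled transfer replaces many fine-grained remote reads.
        self.local_memory.append(bundle)
        self.transfers_in += 1

    def execute(self) -> list[float]:
        # Execution touches only local memory, so in this model its timing
        # does not depend on where the data originally lived.
        results: list[float] = []
        while self.local_memory:
            bundle = self.local_memory.pop(0)
            if bundle.task == "scale":
                gain, *samples = bundle.operands
                results.extend(gain * s for s in samples)
            elif bundle.task == "sum":
                results.append(sum(bundle.operands))
            else:
                raise ValueError(f"unknown task: {bundle.task}")
        return results


def pack_and_ship(tile: Tile, task: str,
                  shared_memory: dict[str, list[float]],
                  keys: list[str]) -> None:
    """Gather the named operands from shared memory and ship them as one bundle."""
    operands = [value for key in keys for value in shared_memory[key]]
    tile.receive(Bundle(task=task, operands=operands))


shared = {"gain": [0.5], "samples": [2.0, 4.0, 6.0]}
tile = Tile("tile0")
pack_and_ship(tile, "scale", shared, ["gain", "samples"])
output = tile.execute()
print(output, tile.transfers_in)
```

In this toy model the tile receives a single transfer for the whole task, which is the property that would make its memory latency predictable; how the actual architecture packs, routes, and schedules bundles is not specified in the text above.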
