Unlimited Vector Processing for Wireless Baseband Based on RISC-V Extension

📅 2025-04-15
🤖 AI Summary
Conventional vector architectures for wireless baseband processing suffer from limited vector register sizes, reliance on power-of-two vector length multipliers, and architecture-specific permutation support. To address these limitations, this paper proposes unlimited vector processing (UVP), an instruction set extension based on RISC-V. UVP introduces a programming model that supports non-power-of-two register grouping and hardware strip-mining; divides vector instructions into symmetric and asymmetric classes, each with specialized load/store strategies; and pairs a permutation engine for asymmetric operations with pipelines optimized for symmetric operations such as fixed-point multiplication and division. Compared with lane-based vector architectures, UVP achieves speedups of up to 3.0× for matrix multiplication and 2.1× for FFT. Synthesized in SMIC 40 nm technology, the 16-lane RTL configuration occupies 0.94 mm² and achieves an area efficiency of 21.2 GOPS/mm².

📝 Abstract
Wireless baseband processing (WBP) serves as an ideal scenario for utilizing vector processing, which excels in managing data-parallel operations due to its parallel structure. However, conventional vector architectures face certain constraints such as limited vector register sizes, reliance on power-of-two vector length multipliers, and vector permutation capabilities tied to specific architectures. To address these challenges, we have introduced an instruction set extension (ISE) based on RISC-V known as unlimited vector processing (UVP). This extension enhances both the flexibility and efficiency of vector computations. UVP employs a novel programming model that supports non-power-of-two register groupings and hardware strip-mining, thus enabling smooth handling of vectors of varying lengths while reducing the software strip-mining burden. Vector instructions are categorized into symmetric and asymmetric classes, complemented by specialized load/store strategies to optimize execution. Moreover, we present a hardware implementation of UVP featuring sophisticated hazard detection mechanisms, optimized pipelines for symmetric tasks such as fixed-point multiplication and division, and a robust permutation engine for effective asymmetric operations. Comprehensive evaluations demonstrate that UVP significantly enhances performance, achieving up to $3.0\times$ and $2.1\times$ speedups in matrix multiplication and fast Fourier transform (FFT) tasks, respectively, when measured against lane-based vector architectures. Our synthesized RTL for a 16-lane configuration using SMIC 40 nm technology spans 0.94 mm$^2$ and achieves an area efficiency of 21.2 GOPS/mm$^2$.
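
To make the "software strip-mining burden" mentioned in the abstract concrete, the following is a minimal Python sketch of a strip-mined element-wise addition. It is only an illustration of the general technique, not UVP code: the function name `strip_mined_add` and the parameter `max_vl` (the largest number of elements one vector operation can cover) are hypothetical. With hardware strip-mining, the explicit chunking loop shown here is what software would no longer have to write.

```python
def strip_mined_add(a, b, max_vl):
    """Element-wise add of two equal-length lists, processed in chunks.

    max_vl is a hypothetical hardware limit on elements per vector
    operation; it is an assumption of this sketch, not a UVP parameter.
    Returns the result and the list of chunk lengths that were used.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if max_vl < 1:
        raise ValueError("max_vl must be at least 1")

    out = []
    chunk_lengths = []
    i = 0
    n = len(a)
    while i < n:
        # Pick the length of this chunk: a full chunk, or the remainder.
        vl = min(max_vl, n - i)
        # One "vector operation" over the current chunk.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        chunk_lengths.append(vl)
        i += vl
    return out, chunk_lengths
```

Calling `strip_mined_add(list(range(10)), [1] * 10, 4)` splits the ten elements into chunks of 4, 4, and 2; the loop bookkeeping and the shorter tail chunk are the per-kernel overhead that software strip-mining carries.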
Problem

Research questions and friction points this paper is trying to address.

Enhancing flexibility in wireless baseband vector processing
Overcoming limitations of conventional vector architectures
Improving efficiency with RISC-V based UVP extension
Innovation

Methods, ideas, or system contributions that make the work stand out.

RISC-V extension enables unlimited vector processing
Novel programming model supports flexible vector lengths
Optimized hardware with hazard detection and pipelines
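
As a rough illustration of why non-power-of-two register grouping matters, the sketch below counts how many vector registers a vector occupies when the group size must be a power of two versus when any group size is allowed. This is plain arithmetic under stated assumptions, not a model of UVP's actual allocation or encoding: the helper names and the `elems_per_reg` parameter are hypothetical.

```python
def registers_exact(n_elems, elems_per_reg):
    """Registers needed when any group size is allowed (ceiling division)."""
    if n_elems < 1 or elems_per_reg < 1:
        raise ValueError("both arguments must be positive")
    return (n_elems + elems_per_reg - 1) // elems_per_reg


def registers_pow2(n_elems, elems_per_reg):
    """Registers occupied when the group size is rounded up to a power of two."""
    need = registers_exact(n_elems, elems_per_reg)
    # Smallest power of two that is >= need.
    return 1 << (need - 1).bit_length()


def wasted_registers(n_elems, elems_per_reg):
    """Extra registers held by the power-of-two constraint."""
    return registers_pow2(n_elems, elems_per_reg) - registers_exact(
        n_elems, elems_per_reg
    )
```

For a 48-element vector with a hypothetical 16 elements per register, an exact grouping uses 3 registers, while a power-of-two grouping holds 4, leaving one register unavailable for other data.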
Limin Jiang
Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
Yi Shi
Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
Yihao Shen
Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
Shan Cao
Shanghai University
ASIC, Wireless communication systems, machine learning acceleration
Zhiyuan Jiang
Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
Sheng Zhou
Beijing National Research Center for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China