Vorion: A RISC-V GPU with Hardware-Accelerated 3D Gaussian Rendering and Training

📅 2025-11-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational overhead and deployment challenges of 3D Gaussian Splatting (3DGS) in real-time rendering on edge devices and workstation-scale 4D reconstruction, this paper presents the first RISC-V-based GPU prototype tailored for 3DGS acceleration. Methodologically, the authors propose a Gaussian/pixel-centric hybrid dataflow architecture and a z-tiling parallel optimization strategy, enabling seamless integration into standard graphics pipelines with minimal modifications. Implemented in TSMC 16nm technology, the scalable GPGPU architecture incorporates dedicated 3DGS hardware modules. Experimental results demonstrate real-time rendering at 19 FPS on a single chip and a training throughput of 38.6 iterations/second under a 16-rasterizer configuration. To the best of the authors' knowledge, this is the first end-to-end hardware acceleration framework for 3D neural rendering and 4D reconstruction on a RISC-V architecture, delivering an efficient, scalable hardware foundation for intelligent edge vision systems.

📝 Abstract
3D Gaussian Splatting (3DGS) has recently emerged as a foundational technique for real-time neural rendering, 3D scene generation, and volumetric video (4D) capture. However, its rendering and training impose massive computational cost, making real-time rendering on edge devices and real-time 4D reconstruction on workstations currently infeasible. Given its fixed-function nature and similarity to traditional rasterization, 3DGS presents a strong case for dedicated hardware in the graphics pipeline of next-generation GPUs. This work, Vorion, presents the first GPGPU prototype with hardware-accelerated 3DGS rendering and training. Vorion features a scalable architecture, minimal hardware changes to traditional rasterizers, z-tiling to increase parallelism, and a Gaussian/pixel-centric hybrid dataflow. We prototype the minimal system (8 SIMT cores, 2 Gaussian rasterizers) using TSMC 16nm FinFET technology, which achieves 19 FPS for rendering. The scaled design with 16 rasterizers achieves 38.6 iterations/s for training.
Problem

Research questions and friction points this paper is trying to address.

Accelerating 3D Gaussian Splatting rendering and training for real-time performance
Enabling real-time neural rendering on edge devices and workstations
Designing dedicated GPU hardware for efficient 3DGS computation
Innovation

Methods, ideas, or system contributions that make the work stand out.

RISC-V GPU with hardware-accelerated 3D Gaussian rendering
Scalable architecture with minimal changes to traditional rasterizers
Hybrid dataflow combining Gaussian and pixel-centric approaches
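To ground the pixel-centric side of the hybrid dataflow named above: 3DGS rasterization blends depth-sorted 2D-projected Gaussians front to back at each pixel, attenuating transmittance until the pixel saturates. The sketch below is an illustrative NumPy model of this standard 3DGS compositing step, not Vorion's hardware implementation; the dictionary fields and thresholds are hypothetical.

```python
# Illustrative model of per-pixel front-to-back alpha compositing as used in
# 3DGS rasterization. Field names ("mean", "inv_cov", "opacity", "color") and
# the cutoff thresholds are assumptions, not taken from the paper.
import numpy as np

def composite_pixel(gaussians, px, py, alpha_min=1.0 / 255, t_min=1e-4):
    """Blend depth-sorted projected 2D Gaussians covering pixel (px, py)."""
    color = np.zeros(3)
    transmittance = 1.0
    for g in gaussians:  # assumed already sorted by increasing depth
        dx = np.array([px, py]) - g["mean"]       # offset from Gaussian center
        power = -0.5 * dx @ g["inv_cov"] @ dx     # 2D Gaussian exponent
        alpha = min(0.99, g["opacity"] * np.exp(power))
        if alpha < alpha_min:
            continue                              # negligible contribution
        color += transmittance * alpha * g["color"]
        transmittance *= 1.0 - alpha              # attenuate remaining light
        if transmittance < t_min:
            break                                 # pixel saturated: early exit
    return color
```

A Gaussian-centric dataflow would instead iterate over Gaussians and scatter contributions to all pixels each one covers; the trade-off between the two traversal orders is what a hybrid dataflow is designed to balance.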
Yipeng Wang
College of Computer Science, Beijing University of Technology
Mengtian Yang
University of Texas at Austin, Austin, TX
Chieh-pu Lo
University of Texas at Austin, Austin, TX
Jaydeep P. Kulkarni
University of Texas at Austin, Austin, TX