GCC: A 3DGS Inference Architecture with Gaussian-Wise and Cross-Stage Conditional Processing

📅 2025-07-21

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Existing mobile 3D Gaussian Splatting (3DGS) accelerators decouple preprocessing from rendering, which causes redundant computation and data movement. To address this, the paper proposes GCC, a hardware acceleration architecture with three key innovations: (1) a cross-stage dynamic skipping mechanism that runs preprocessing only when rendering actually needs it, so Gaussians that would never be used are not preprocessed; (2) Gaussian-granularity rendering scheduling, which eliminates redundant tile-wise data reloading; and (3) an alpha-based boundary detection method that derives compact, accurate bounds on each Gaussian's effective region. Implemented in 28 nm CMOS technology, GCC achieves 2.1× higher energy efficiency and 1.8× higher throughput than the state-of-the-art accelerator GSCore. It also reduces redundant computation by 63% and memory accesses by 57%.

📝 Abstract

3D Gaussian Splatting (3DGS) has emerged as a leading neural rendering technique for high-fidelity view synthesis, prompting the development of dedicated 3DGS accelerators for mobile applications. Through in-depth analysis, we identify two major limitations in the conventional decoupled preprocessing-rendering dataflow adopted by existing accelerators: 1) a significant portion of preprocessed Gaussians is never used in rendering, and 2) the same Gaussian is loaded repeatedly across different tile renderings, resulting in substantial computational and data movement overhead. To address these issues, we propose GCC, a novel accelerator designed for fast and energy-efficient 3DGS inference. At the dataflow level, GCC introduces: 1) cross-stage conditional processing, which interleaves preprocessing and rendering to dynamically skip unnecessary Gaussian preprocessing; and 2) Gaussian-wise rendering, which completes all rendering operations for a given Gaussian before moving to the next, thereby eliminating duplicated Gaussian loading. We also propose an alpha-based boundary identification method to derive compact and accurate Gaussian regions, thereby reducing rendering costs. We implement our GCC accelerator in 28 nm technology. Extensive experiments demonstrate that GCC significantly outperforms the state-of-the-art 3DGS inference accelerator, GSCore, in both performance and energy efficiency.

Problem

Research questions and friction points this paper is trying to address.

Many preprocessed Gaussians are never used in rendering

The same Gaussian is loaded repeatedly across tile renderings

Both effects limit 3DGS inference speed and energy efficiency
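
The duplicated-loading problem can be made concrete with a small counting sketch. It is an illustration only, not the paper's model: the helper names `tiles_touched` and `count_loads` are hypothetical, the bounding boxes are made-up inputs, and the 16-pixel tile edge is assumed as a common choice in tile-based 3DGS rasterizers. In a tile-wise dataflow a Gaussian is fetched once for every tile its footprint overlaps, while in a Gaussian-wise dataflow it is fetched once in total.

```python
TILE = 16  # assumed tile edge in pixels (a common choice, not taken from the paper)


def tiles_touched(bbox, tile=TILE):
    """Number of tiles overlapped by an axis-aligned pixel bounding box.

    bbox = (x_min, y_min, x_max, y_max), inclusive pixel coordinates.
    """
    x0, y0, x1, y1 = bbox
    nx = x1 // tile - x0 // tile + 1  # tile columns covered
    ny = y1 // tile - y0 // tile + 1  # tile rows covered
    return nx * ny


def count_loads(bboxes, tile=TILE):
    """Return (tile_wise_loads, gaussian_wise_loads) for a list of footprints."""
    tile_wise = sum(tiles_touched(b, tile) for b in bboxes)  # one load per (Gaussian, tile) pair
    gaussian_wise = len(bboxes)                               # one load per Gaussian
    return tile_wise, gaussian_wise


if __name__ == "__main__":
    # Hypothetical footprints: small, medium, and large Gaussians.
    footprints = [(0, 0, 15, 15), (10, 10, 40, 20), (0, 0, 63, 63)]
    print(count_loads(footprints))
```

The gap between the two counts grows with footprint size, which is why large Gaussians dominate the redundant traffic in a tile-wise schedule.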

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-stage conditional processing for dynamic skipping

Gaussian-wise rendering to eliminate duplicate loading

Alpha-based boundary identification for compact regions
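
One way to read the alpha-based boundary idea is sketched below. It is a minimal sketch under stated assumptions, not the paper's exact method: it assumes the standard 3DGS footprint model, alpha(d) = opacity · exp(−½ dᵀΣ⁻¹d), and a cutoff of 1/255 below which a contribution is ignored. Solving alpha(d) ≥ cutoff gives dᵀΣ⁻¹d ≤ 2·ln(opacity / cutoff), so the bound depends on opacity rather than being a fixed multiple of the standard deviation. The function names and the fixed 3-sigma baseline are hypothetical, chosen for comparison.

```python
import math

ALPHA_MIN = 1.0 / 255.0  # assumed cutoff below which a contribution is ignored


def alpha_bbox_half_extent(opacity, cov_xx, cov_yy, alpha_min=ALPHA_MIN):
    """Half-width and half-height of the axis-aligned box enclosing alpha >= alpha_min.

    cov_xx and cov_yy are the diagonal entries of the projected 2D covariance.
    """
    if opacity <= alpha_min:
        return 0.0, 0.0  # the Gaussian never reaches the cutoff anywhere
    r2 = 2.0 * math.log(opacity / alpha_min)  # squared Mahalanobis radius at the cutoff
    return math.sqrt(r2 * cov_xx), math.sqrt(r2 * cov_yy)


def three_sigma_half_extent(cov_xx, cov_yy):
    """Opacity-independent baseline: a fixed 3-sigma extent per axis."""
    return 3.0 * math.sqrt(cov_xx), 3.0 * math.sqrt(cov_yy)


if __name__ == "__main__":
    # Hypothetical low-opacity Gaussian with variance 4 in x and 1 in y.
    print(alpha_bbox_half_extent(0.1, 4.0, 1.0))
    print(three_sigma_half_extent(4.0, 1.0))
```

For a low-opacity Gaussian the alpha-based box is smaller than the 3-sigma box, so fewer pixels are visited; for a nearly opaque one it is slightly larger, which avoids clipping pixels that still contribute.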
