GEMM-GS: Accelerating 3D Gaussian Splatting on Tensor Cores with GEMM-Compatible Blending

📅 2026-04-02

🤖 AI Summary

This work addresses the problem that 3D Gaussian Splatting (3DGS) does not use the Tensor Cores of modern GPUs, which limits its suitability for real-time rendering. The authors present what is described as the first formulation that recasts the 3DGS rasterization blending step as an equivalent general matrix multiplication (GEMM), so that it can run on Tensor Cores. They also design a high-performance CUDA kernel with a three-stage double-buffered pipeline that overlaps computation with memory access. The method achieves a 1.42× speedup over the original 3DGS implementation and, when combined with existing acceleration techniques, an additional 1.47× speedup on average.

📝 Abstract

Neural Radiance Fields (NeRF) enables 3D scene reconstruction from several 2D images but incurs high rendering latency because of its point-sampling design. 3D Gaussian Splatting (3DGS) improves on NeRF with an explicit scene representation and an optimized pipeline, yet it still fails to meet practical real-time demands. Existing acceleration works overlook the evolving Tensor Cores of modern GPUs because the 3DGS pipeline lacks General Matrix Multiplication (GEMM) operations. This paper proposes GEMM-GS, an acceleration approach that utilizes GPU Tensor Cores via a GEMM-friendly blending transformation. It equivalently reformulates the 3DGS blending process into a GEMM-compatible form to utilize Tensor Cores. A high-performance CUDA kernel is designed, integrating a three-stage double-buffered pipeline that overlaps computation and memory access. Extensive experiments show that GEMM-GS achieves $1.42\times$ speedup over vanilla 3DGS and provides an additional $1.47\times$ speedup on average when combined with existing acceleration approaches. Code is released at https://github.com/shieldforever/GEMM-GS.
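
The abstract does not spell out the reformulation, so the following is only a minimal sketch of why a GEMM-compatible form can exist, and it is not necessarily the formulation used in GEMM-GS. It relies on the standard 3DGS per-pixel exponent, power = -0.5 (a dx² + c dy²) - b dx dy, where (a, b, c) is the inverse 2D covariance ("conic") and (dx, dy) is the offset between the Gaussian center and the pixel. Expanding that quadratic splits it into a per-pixel feature vector and a per-Gaussian coefficient vector, so the exponents for a whole tile of pixels and a batch of Gaussians become one matrix product. All function names and the sample values below are hypothetical and chosen for illustration only.

```python
def pixel_features(px, py):
    # Per-pixel row: monomials of the pixel coordinate (assumed layout).
    return [px * px, px * py, py * py, px, py, 1.0]


def gaussian_coeffs(mx, my, a, b, c):
    # Per-Gaussian column: coefficients of the expanded quadratic form.
    return [
        -0.5 * a,
        -b,
        -0.5 * c,
        a * mx + b * my,
        b * mx + c * my,
        -0.5 * a * mx * mx - b * mx * my - 0.5 * c * my * my,
    ]


def matmul(A, B):
    # Plain triple-loop GEMM, (m x k) @ (k x n); stands in for a Tensor Core call.
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(n)]
            for i in range(m)]


def power_direct(px, py, mx, my, a, b, c):
    # Reference: the exponent evaluated one (pixel, Gaussian) pair at a time.
    dx, dy = mx - px, my - py
    return -0.5 * (a * dx * dx + c * dy * dy) - b * dx * dy


# Hypothetical 4x4 pixel tile (pixel centers) and three Gaussians (mx, my, a, b, c).
pixels = [(x + 0.5, y + 0.5) for y in range(4) for x in range(4)]
gaussians = [
    (1.0, 2.0, 0.8, 0.10, 0.5),
    (3.0, 0.5, 0.3, -0.05, 0.9),
    (2.2, 2.7, 1.2, 0.20, 0.6),
]

F = [pixel_features(px, py) for px, py in pixels]                            # 16 x 6
G = [list(row) for row in zip(*(gaussian_coeffs(*g) for g in gaussians))]    # 6 x 3

P_gemm = matmul(F, G)                                                        # 16 x 3
P_ref = [[power_direct(px, py, *g) for g in gaussians] for px, py in pixels]

max_err = max(abs(P_gemm[i][j] - P_ref[i][j])
              for i in range(len(pixels)) for j in range(len(gaussians)))
print("max abs difference:", max_err)
```

The sketch covers only the exponent. Computing opacity from it and accumulating color front to back remain sequential per pixel here, and how GEMM-GS handles those steps is not stated in the abstract.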

Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting

Tensor Cores

GEMM

real-time rendering

GPU acceleration

Innovation

Methods, ideas, or system contributions that make the work stand out.

GEMM

Tensor Cores

3D Gaussian Splatting