Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting

📅 2025-12-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D Gaussian Splatting (3D-GS) methods struggle with efficient rendering of high-dimensional semantic features for open-vocabulary segmentation (OVS), while conventional compression or codebook-based approaches incur significant information loss and accuracy degradation. To address this, we propose Q-Render—a sparse rendering paradigm that quantizes ray sampling to retain only view-dominant Gaussians, coupled with differentiable sparse integration and high-fidelity mapping of high-dimensional features, thereby eliminating volumetric rendering redundancy and distortion. We further design GS-Net, a lightweight network enabling end-to-end joint prediction of 3D Gaussian parameters and semantic features. Evaluated on ScanNet and LeRF, our method achieves substantial improvements over state-of-the-art approaches: it enables real-time rendering of 512-dimensional features—43.7× faster than prior work—while significantly boosting segmentation accuracy.

📝 Abstract
Recent advancements in computer vision have successfully extended open-vocabulary segmentation (OVS) to the 3D domain by leveraging 3D Gaussian Splatting (3D-GS). Despite this progress, efficiently rendering the high-dimensional features required for open-vocabulary queries poses a significant challenge. Existing methods employ codebooks or feature compression, causing information loss and thereby degrading segmentation quality. To address this limitation, we introduce Quantile Rendering (Q-Render), a novel rendering strategy for 3D Gaussians that efficiently handles high-dimensional features while maintaining high fidelity. Unlike conventional volume rendering, which densely samples all 3D Gaussians intersecting each ray, Q-Render sparsely samples only those with dominant influence along the ray. By integrating Q-Render into a generalizable 3D neural network, we also propose the Gaussian Splatting Network (GS-Net), which predicts Gaussian features in a generalizable manner. Extensive experiments on ScanNet and LeRF demonstrate that our framework outperforms state-of-the-art methods while enabling real-time rendering with a ~43.7x speedup on 512-D feature maps. Code will be made publicly available.
Problem

Research questions and friction points this paper is trying to address.

Efficiently rendering high-dimensional features for 3D open-vocabulary segmentation
Overcoming information loss from codebooks or compression in existing methods
Achieving real-time rendering while maintaining high fidelity in 3D Gaussian Splatting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantile Rendering sparsely samples dominant Gaussians along rays
GS-Net predicts Gaussian features in a generalizable manner
Framework enables real-time rendering with significant speedup
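The sparse-versus-dense distinction at the heart of Q-Render can be illustrated with a toy NumPy sketch. This is an illustrative assumption, not the paper's implementation: the function names, the fixed top-`k` selection, and the weight renormalization below are simplifications standing in for the actual quantile-based selection of view-dominant Gaussians and its differentiable sparse integration.

```python
import numpy as np

def dense_composite(alphas, feats):
    """Standard front-to-back alpha compositing over ALL Gaussians on a ray.
    For D-dimensional features this touches every intersected Gaussian,
    which is the redundancy Q-Render aims to eliminate."""
    T = 1.0  # accumulated transmittance along the ray
    out = np.zeros(feats.shape[1])
    for a, f in zip(alphas, feats):
        w = T * a          # blending weight of this Gaussian
        out += w * f
        T *= (1.0 - a)
    return out

def sparse_composite(alphas, feats, k=4):
    """Toy stand-in for sparse rendering: keep only the k Gaussians with the
    largest blending weights (the 'dominant' ones), renormalize, and blend
    just those high-dimensional features."""
    T = 1.0
    weights = np.empty(len(alphas))
    for i, a in enumerate(alphas):
        weights[i] = T * a
        T *= (1.0 - a)
    keep = np.argsort(weights)[-k:]           # indices of dominant Gaussians
    w = weights[keep] / weights[keep].sum()   # renormalize retained weights
    return w @ feats[keep]                    # blend only k feature vectors
```

The speedup intuition: with 512-D features, the dense loop does one 512-D multiply-add per intersected Gaussian, while the sparse path does only `k` of them, with the cheap scalar weight pass deciding which Gaussians matter for this view.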
Yoonwoo Jeong (NVIDIA, POSTECH) - Computer Vision
Cheng Sun (NVIDIA)
Frank Wang (NVIDIA)
Minsu Cho (POSTECH)
Jaesung Choe (NVIDIA) - 3D computer vision