Feed-Forward 3D Gaussian Splatting Compression with Long-Context Modeling

📅 2025-11-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing compression methods for 3D Gaussian Splatting (3DGS) suffer from limited modeling capacity for long-range spatial dependencies, primarily due to narrow receptive fields in transform coding networks and insufficient context capacity in entropy models—challenges exacerbated by the large-scale nature of 3DGS data. Method: We propose the first feed-forward 3DGS compression framework. It constructs a large-scale spatial context structure via Morton-order indexing, designs a joint spatial-channel autoregressive entropy model, and introduces an attention-driven transform coding network to jointly overcome receptive-field and contextual modeling bottlenecks. Contribution/Results: Our method achieves up to 20× compression ratio under feed-forward inference, significantly outperforming existing generalizable 3DGS compression approaches and establishing new state-of-the-art performance.

📝 Abstract
3D Gaussian Splatting (3DGS) has emerged as a revolutionary 3D representation. However, its substantial data size poses a major barrier to widespread adoption. While feed-forward 3DGS compression offers a practical alternative to costly per-scene trained compressors, existing methods struggle to model long-range spatial dependencies, due to the limited receptive field of transform coding networks and the inadequate context capacity in entropy models. In this work, we propose a novel feed-forward 3DGS compression framework that effectively models long-range correlations to enable highly compact and generalizable 3D representations. Central to our approach is a large-scale context structure that comprises thousands of Gaussians based on Morton serialization. We then design a fine-grained space-channel auto-regressive entropy model to fully leverage this expansive context. Furthermore, we develop an attention-based transform coding model to extract informative latent priors by aggregating features from a wide range of neighboring Gaussians. Our method yields a $20\times$ compression ratio for 3DGS in a feed-forward inference and achieves state-of-the-art performance among generalizable codecs.
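A space-channel auto-regressive entropy model of the kind described above splits the latent code into spatial groups and channel chunks that are decoded sequentially, each conditioned on everything already decoded. Under assumed notation (spatial groups $g = 1,\dots,G$, channel chunks $c = 1,\dots,C$; not taken from the paper), the factorization reads:

$$
p(\mathbf{z}) = \prod_{g=1}^{G} \prod_{c=1}^{C} p\left(\mathbf{z}_{g,c} \mid \mathbf{z}_{<g},\, \mathbf{z}_{g,<c}\right)
$$

so the bit cost $-\log_2 p(\mathbf{z})$ shrinks as each conditional distribution is sharpened by a larger decoded context.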
Problem

Research questions and friction points this paper is trying to address.

Compress 3D Gaussian Splatting data efficiently
Model long-range spatial dependencies in 3DGS compression
Achieve high compression ratio with feed-forward inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale context structure with Morton serialization
Fine-grained space-channel auto-regressive entropy model
Attention-based transform coding for wide neighbor feature aggregation
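Morton (Z-order) serialization, the standard technique named above, maps each Gaussian's quantized 3D position to a single integer key by interleaving coordinate bits; sorting by this key keeps spatially nearby Gaussians adjacent in the 1D context sequence. A minimal sketch of a generic 3D Morton encoder (not the authors' implementation):

```python
def part1by2(n: int) -> int:
    """Spread the low 10 bits of n so consecutive bits end up 3 apart."""
    n &= 0x3FF
    n = (n | (n << 16)) & 0xFF0000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton3d(x: int, y: int, z: int) -> int:
    """Interleave three 10-bit coordinates into a 30-bit Morton key."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

# Sorting quantized Gaussian centers by Morton key yields a
# locality-preserving 1D ordering for large-scale context modeling.
points = [(5, 9, 1), (0, 0, 0), (5, 8, 1), (31, 31, 31)]
ordered = sorted(points, key=lambda p: morton3d(*p))
```

Note that nearby points such as `(5, 8, 1)` and `(5, 9, 1)` receive adjacent keys, which is exactly the property a serialized spatial context exploits.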
Zhening Liu
Hong Kong University of Science and Technology
Rui Song
Hong Kong University of Science and Technology
Yushi Huang
Hong Kong University of Science and Technology
Efficient AI
Yingdong Hu
Institute for Interdisciplinary Information Sciences, Tsinghua University
Computer Vision, Robotics
Xinjie Zhang
Researcher, Microsoft Research Asia
Multimodal Understanding and Generation, Neural Compression, Gaussian Splatting
Jiawei Shao
Hong Kong University of Science and Technology, Institute of Artificial Intelligence (TeleAI), China Telecom
Zehong Lin
Research Assistant Professor, Hong Kong University of Science and Technology
Edge AI, Machine Learning
Jun Zhang
Hong Kong University of Science and Technology