🤖 AI Summary
Existing generalizable Gaussian Splatting (GS) methods struggle to learn discriminative, multi-view-consistent features under sparse-view conditions, leading to poor generalization and inaccurate geometry reconstruction. To address this, we propose C³-GS, a lightweight framework that imposes context-aware, cross-dimensional, and cross-scale constraints on feature learning and jointly predicts per-pixel Gaussian parameters without additional supervision. By enforcing geometric consistency and enhancing rendering photorealism, C³-GS enables high-fidelity novel view synthesis without per-scene optimization. On standard benchmarks, C³-GS achieves state-of-the-art rendering quality and cross-scene generalization, significantly outperforming prior approaches while striking an effective balance between computational efficiency and reconstruction fidelity across diverse scenes with minimal input views.
📝 Abstract
Generalizable Gaussian Splatting aims to synthesize novel views for unseen scenes without per-scene optimization. In particular, recent advancements utilize feed-forward networks to predict per-pixel Gaussian parameters, enabling high-quality synthesis from sparse input views. However, existing approaches fall short in encoding discriminative, multi-view-consistent features for Gaussian prediction, making it difficult to construct accurate geometry from sparse views. To address this, we propose $\mathbf{C}^{3}$-GS, a framework that enhances feature learning by incorporating context-aware, cross-dimensional, and cross-scale constraints. Our architecture integrates three lightweight modules into a unified rendering pipeline, improving feature fusion and enabling photorealistic synthesis without requiring additional supervision. Extensive experiments on benchmark datasets validate that $\mathbf{C}^{3}$-GS achieves state-of-the-art rendering quality and generalization ability. Code is available at: https://github.com/YuhsiHu/C3-GS.
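To make the feed-forward setup concrete, the sketch below shows what "predicting per-pixel Gaussian parameters" typically looks like: a prediction head maps each pixel's encoded feature vector to raw Gaussian attributes (depth, opacity, anisotropic scale, rotation quaternion, color), with standard activations to keep each parameter in a valid range. This is a minimal illustrative sketch with assumed shapes, a linear head, and activation choices; it is not the C³-GS architecture.

```python
import numpy as np

# Hypothetical per-pixel Gaussian prediction head (assumed design, not C3-GS).
# Each pixel's feature vector is mapped to one Gaussian's raw parameters.
H, W, F = 4, 4, 16           # feature-map height, width, channels (assumed)
P = 1 + 1 + 3 + 4 + 3        # depth, opacity, scale(3), rotation quat(4), RGB(3)

rng = np.random.default_rng(0)
feats = rng.normal(size=(H, W, F))        # per-pixel features from an encoder
weight = 0.1 * rng.normal(size=(F, P))    # stand-in for a learned linear head
raw = feats @ weight                      # (H, W, P) raw predictions

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

depth   = np.exp(raw[..., 0])             # exp -> strictly positive depth
opacity = sigmoid(raw[..., 1])            # sigmoid -> opacity in (0, 1)
scale   = np.exp(raw[..., 2:5])           # positive anisotropic scales
quat    = raw[..., 5:9]
quat    = quat / np.linalg.norm(quat, axis=-1, keepdims=True)  # unit quaternion
color   = sigmoid(raw[..., 9:12])         # RGB in (0, 1)

# One Gaussian per pixel: H*W Gaussians total for this view.
print(depth.shape, opacity.shape, scale.shape, quat.shape, color.shape)
```

In practice the head is trained end-to-end with a photometric rendering loss, and depth is unprojected through the camera to place each Gaussian in 3D; the activations above are one common way to constrain the parameter ranges.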