🤖 AI Summary
This work addresses the limited reconstruction quality and representation efficiency of existing splatting-based methods for novel view synthesis. The authors propose a differentiable, end-to-end framework that automatically learns view-dependent 2D splatting kernels. Each volumetric primitive is a bounding ellipsoid paired with a 3D-kernel latent vector. A projection network maps the ellipsoid attributes and the 3D-kernel latent to a 2D-kernel latent, which a decoder turns into a radially symmetric 2D kernel expressed in terms of Mahalanobis distance and bounded by the projected ellipsoid. The networks and per-primitive attributes are optimized jointly. On standard benchmarks, the method compares favorably against state-of-the-art techniques based on both analytical and learned kernels, and the idea extends to learning general 2D kernels for 2D splatting and image representation.
📝 Abstract
We present a differentiable framework to automatically learn view-dependent 2D kernels in a splatting-based pipeline to improve reconstruction quality and representation efficiency for novel 3D view synthesis. Our volumetric primitive is defined as a bounding ellipsoid and a 3D-kernel latent vector. We first learn a projection network to output a 2D-kernel latent, taking the attributes of the ellipsoid and the 3D-kernel latent as input. Next, the result is sent to a decoder to produce a radially symmetric 2D kernel in terms of Mahalanobis distance, bounded by the projected ellipsoid. The neural networks along with per-primitive attributes are jointly optimized. The effectiveness of our approach is demonstrated on standard benchmarks, comparing favorably against state-of-the-art techniques on both analytical and learned kernels. Finally, we extend the idea to learn general 2D kernels for 2D splatting as well as image representation.
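To make the kernel evaluation step concrete, the following is a minimal NumPy sketch of a radially symmetric 2D kernel driven by Mahalanobis distance and cut off at the projected ellipse. It is not the authors' implementation. The function names, the tiny MLP decoder, its weight shapes, and the convention that the ellipse boundary sits at Mahalanobis distance 1 are all assumptions made for illustration. The projection network is omitted, so the 2D-kernel latent is taken as given.

```python
import numpy as np

def mahalanobis_sq(pixels, center, cov2d):
    """Squared Mahalanobis distance from each pixel to the projected ellipse center."""
    diff = pixels - center                      # (N, 2) offsets in screen space
    inv = np.linalg.inv(cov2d)                  # (2, 2) inverse of the ellipse shape matrix
    return np.einsum("ni,ij,nj->n", diff, inv, diff)

def decode_kernel(d, latent, w1, b1, w2, b2):
    """Hypothetical decoder: a two-layer MLP mapping (distance, 2D-kernel latent) to a weight in (0, 1)."""
    tiled = np.broadcast_to(latent, (d.shape[0], latent.shape[0]))
    x = np.concatenate([d[:, None], tiled], axis=1)   # (N, 1 + latent_dim)
    h = np.tanh(x @ w1 + b1)
    out = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-out[:, 0]))           # sigmoid keeps weights in (0, 1)

def splat_kernel(pixels, center, cov2d, latent, params):
    """Kernel weight per pixel; zero outside the ellipse (assumed boundary: distance 1)."""
    d = np.sqrt(mahalanobis_sq(pixels, center, cov2d))
    weights = decode_kernel(d, latent, *params)
    return np.where(d <= 1.0, weights, 0.0)
```

Because the decoder sees only the scalar distance and the latent, any two pixels at the same Mahalanobis distance receive the same weight, which is what makes the kernel radially symmetric in the ellipse's own metric. In a trainable version, the decoder weights and the per-primitive latent would be optimized jointly through an automatic differentiation framework rather than fixed as here.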