🤖 AI Summary
Gaussian Splatting enables efficient, high-fidelity 3D scene reconstruction, yet its volumetric representation hinders precise and efficient extraction of faithful surface meshes: existing post-processing methods suffer from detail loss and high computational cost, and often yield meshes with millions of vertices. This paper introduces the first end-to-end framework that embeds a triangle mesh directly into the Gaussian training loop, enabling joint differentiable optimization of Gaussians and mesh geometry. The approach leverages differentiable Delaunay triangulation, bidirectional consistency constraints, and Gaussian-kernel-driven signed distance estimation to dynamically generate geometrically coherent surfaces directly from Gaussian parameters. While preserving state-of-the-art reconstruction quality, the method reduces mesh vertex count by an order of magnitude, yielding compact, high-fidelity meshes significantly better suited for downstream tasks such as physics simulation and animation.
📝 Abstract
While recent advances in Gaussian Splatting have enabled fast reconstruction of high-quality 3D scenes from images, extracting accurate surface meshes remains a challenge. Current approaches extract the surface through costly post-processing steps, which either lose fine geometric details or require significant time and produce very dense meshes with millions of vertices. More fundamentally, the a posteriori conversion from a volumetric to a surface representation limits the ability of the final mesh to preserve all geometric structures captured during training. We present MILo, a novel Gaussian Splatting framework that bridges the gap between volumetric and surface representations by differentiably extracting a mesh from the 3D Gaussians. We design a fully differentiable procedure that constructs the mesh, including both vertex locations and connectivity, at every iteration directly from the parameters of the Gaussians, which are the only quantities optimized during training. Our method introduces three key technical contributions: (1) a bidirectional consistency framework ensuring that both representations, the Gaussians and the extracted mesh, capture the same underlying geometry during training; (2) an adaptive mesh extraction process performed at each training iteration, which uses Gaussians as differentiable pivots for Delaunay triangulation; (3) a novel method for computing signed distance values from the 3D Gaussians that enables precise surface extraction while avoiding geometric erosion. Our approach can reconstruct complete scenes, including backgrounds, with state-of-the-art quality while requiring an order of magnitude fewer mesh vertices than previous methods. Owing to their light weight and empty interiors, our meshes are well suited for downstream applications such as physics simulations and animation.
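The abstract does not spell out the extraction pipeline, but its core idea (Delaunay connectivity built over the Gaussians as pivots, plus a per-vertex scalar field derived from the Gaussian kernels, whose level set locates the surface) can be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the function names are invented, the sketch uses an unsigned Gaussian-mixture density in place of the paper's signed distance values, it is non-differentiable NumPy/SciPy rather than a training-loop component, and the "extraction" merely flags tetrahedra straddling a density level.

```python
import numpy as np
from scipy.spatial import Delaunay


def gaussian_density(points, means, inv_covs, opacities):
    """Evaluate a sum-of-Gaussians density at query points.

    points: (P, 3) query positions; means: (N, 3) Gaussian centers;
    inv_covs: (N, 3, 3) inverse covariances; opacities: (N,) weights.
    Stands in (unsigned) for the paper's signed distance values.
    """
    d = points[:, None, :] - means[None, :, :]             # (P, N, 3) offsets
    mahal = np.einsum('pni,nij,pnj->pn', d, inv_covs, d)   # squared Mahalanobis
    return (opacities[None, :] * np.exp(-0.5 * mahal)).sum(axis=1)


def extract_candidate_surface(means, inv_covs, opacities, level):
    """Tetrahedralize the Gaussian centers (the 'pivots') and flag
    tetrahedra that straddle the chosen density level set, i.e. the
    cells a marching-tetrahedra step would turn into surface triangles.
    """
    tets = Delaunay(means)                 # connectivity from Gaussian pivots
    vals = gaussian_density(means, means, inv_covs, opacities) - level
    crossing = [tet for tet in tets.simplices
                if vals[tet].min() < 0 < vals[tet].max()]  # sign change inside
    return tets, np.asarray(crossing)
```

In the paper this construction is repeated (differentiably) at every training iteration, so gradients from mesh-based losses can flow back into the Gaussian parameters; the sketch above only shows the geometric skeleton of one such extraction.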