🤖 AI Summary
Existing feed-forward 3D Gaussian splatting models struggle to achieve high-quality reconstruction without subsequent optimization, both because large-scale 3D annotations are scarce and because their training objectives are misaligned with the downstream optimizer. This work proposes ForeSplat, an optimization-aware training framework whose MetaGrad mechanism derives supervision signals from short-horizon inner-loop optimization trajectories, without requiring higher-order differentiation, so that the model produces initial representations tailored for efficient optimization. By offloading part of the modeling burden to the optimizer, the approach reduces the required network capacity. Experiments on AnySplat, Pi3X, and a lightweight distilled architecture show that ForeSplat-trained initializations converge in fewer refinement steps and reach a higher peak reconstruction quality than the original models, even when those are fully converged.
📝 Abstract
Feed-forward 3D Gaussian Splatting (3DGS) models offer fast single-pass reconstruction, but scaling them to match per-scene optimization quality is fundamentally hindered by the scarcity of large-scale 3D annotations. A practical compromise is predict-then-refine, where post-prediction optimization compensates for the limited capacity of the feed-forward network. However, standard feed-forward 3DGS is trained solely for zero-step rendering error, ignoring whether its output constitutes a good initialization for the downstream optimizer. We present ForeSplat, an optimization-aware training framework that equips feed-forward 3DGS models to produce initializations explicitly designed for rapid, effective refinement. By offloading part of the scene-modeling burden to the optimizer, ForeSplat substantially reduces the capacity pressure on the feed-forward model, making high-quality reconstruction feasible even with compact networks. At its core is MetaGrad, a lightweight multi-anchor meta-gradient training rule that bypasses costly higher-order differentiation through the 3DGS optimizer. MetaGrad unrolls a short inner-loop refinement trajectory, samples anchor states, and back-propagates aggregated first-order gradients to the prediction head as a surrogate optimization-aware signal. This fine-tuning adds no inference cost and enables high-quality reconstruction within seconds after a few refinement steps. We instantiate ForeSplat on diverse backbones, including AnySplat, Pi3X, and a distilled variant tailored for edge deployment. Across all tested architectures, a ForeSplat-trained initialization converges in fewer refinement steps and reaches a higher peak reconstruction quality than its vanilla counterpart, even fully converged. The framework consistently bridges the gap between amortized prediction and per-scene optimization, establishing a practical path toward lightweight, high-fidelity 3D reconstruction.
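The three MetaGrad steps named in the abstract (unroll a short refinement trajectory, sample anchor states, aggregate first-order gradients) can be illustrated with a toy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: a quadratic loss stands in for the rendering loss, plain gradient descent stands in for the 3DGS optimizer, anchors are sampled uniformly, gradients are averaged, and each anchor's gradient is passed to the initialization as if the trajectory's Jacobian were the identity (one common way to avoid higher-order differentiation). All function names are hypothetical.

```python
import random

def toy_loss(g, target):
    """Toy stand-in for a rendering loss: 0.5 * ||g - target||^2."""
    return 0.5 * sum((gi - ti) ** 2 for gi, ti in zip(g, target))

def toy_loss_grad(g, target):
    """Gradient of the toy loss with respect to the parameters g."""
    return [gi - ti for gi, ti in zip(g, target)]

def unroll_refinement(g0, target, steps, lr):
    """Run a short inner loop of plain gradient descent (assumed optimizer).

    Returns every visited state, including the initialization g0.
    """
    states = [list(g0)]
    g = list(g0)
    for _ in range(steps):
        grad = toy_loss_grad(g, target)
        g = [gi - lr * di for gi, di in zip(g, grad)]
        states.append(list(g))
    return states

def metagrad_surrogate(g0, target, steps=4, lr=0.1, num_anchors=2, rng=None):
    """First-order surrogate gradient for the predicted initialization g0.

    Assumption: each anchor's loss gradient is applied to g0 directly,
    i.e. no differentiation through the inner-loop updates.
    """
    rng = rng or random.Random(0)
    states = unroll_refinement(g0, target, steps, lr)
    anchor_ids = sorted(rng.sample(range(len(states)), num_anchors))
    agg = [0.0] * len(g0)
    for k in anchor_ids:
        grad_k = toy_loss_grad(states[k], target)
        # Average the anchor gradients (assumed aggregation rule).
        agg = [a + d / num_anchors for a, d in zip(agg, grad_k)]
    return agg, anchor_ids

def outer_step(theta, target, outer_lr=0.5, **kwargs):
    """One outer update of a trivial 'prediction head' whose output is theta."""
    agg, _ = metagrad_surrogate(theta, target, **kwargs)
    return [t - outer_lr * a for t, a in zip(theta, agg)]
```

In a real setting the aggregated gradient would be back-propagated through the network that predicts the Gaussians rather than applied to a free parameter vector, and the inner loop would run the actual 3DGS optimizer on rendering losses. The sketch only shows where the first-order shortcut enters: the inner loop is run forward, but nothing is differentiated through it.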