🤖 AI Summary
To address low reconstruction accuracy and severe distortion—particularly in texture-rich regions—under sparse-frame 4D dynamic scene capture, this paper proposes the first 4D Gaussian Splatting framework tailored for sparse inputs. Our method introduces three key innovations: (1) a texture-aware deformation regularization to suppress dynamic deformation artifacts; (2) canonical-space optimization coupled with a texture-gradient-based depth alignment loss to improve geometric consistency; and (3) a texture-noise-guided gradient optimization mechanism to enhance fine-detail recovery. Evaluated on NeRF-Synthetic, HyperNeRF, NeRF-DS, and our newly introduced iPhone-4D dataset, the approach achieves state-of-the-art performance using only 3–8 input views. It significantly outperforms existing dynamic and few-shot reconstruction methods, yielding average improvements of +2.1 dB in PSNR and +0.032 in SSIM specifically within texture-rich regions.
📝 Abstract
Dynamic Gaussian Splatting approaches have achieved remarkable performance for 4D scene reconstruction. However, these approaches rely on dense-frame video sequences for photorealistic reconstruction. In real-world scenarios, due to equipment constraints, sometimes only sparse frames are accessible. In this paper, we propose Sparse4DGS, the first method for sparse-frame dynamic scene reconstruction. We observe that dynamic reconstruction methods fail in both canonical and deformed spaces under sparse-frame settings, especially in areas with high texture richness. Sparse4DGS tackles this challenge by focusing on texture-rich areas. For the deformation network, we propose Texture-Aware Deformation Regularization, which introduces a texture-based depth alignment loss to regulate Gaussian deformation. For the canonical Gaussian field, we introduce Texture-Aware Canonical Optimization, which incorporates texture-based noise into the gradient descent process of canonical Gaussians. Extensive experiments show that when taking sparse frames as inputs, our method outperforms existing dynamic or few-shot techniques on NeRF-Synthetic, HyperNeRF, NeRF-DS, and our iPhone-4D datasets.