🤖 AI Summary
To address the challenges of motion blur in RGB frames and the difficulty of jointly reconstructing dynamic human bodies and static scenes from monocular event-camera videos, this paper proposes an event-guided unified 3D Gaussian modeling framework. The method introduces three key components: (1) an event-driven loss function that enhances geometric and textural detail recovery in fast-moving regions; (2) a unified Gaussian representation with a learnable semantic attribute, jointly optimizing deformable human Gaussians and static scene Gaussians in an end-to-end co-reconstruction framework; and (3) integration of event streams with RGB brightness-change priors to improve motion consistency. Evaluated on the ZJU-MoCap-Blur and MMHPSD-Blur benchmarks, the approach achieves state-of-the-art performance: clear gains in PSNR and SSIM, a notable reduction in LPIPS, and superior robustness under fast-motion conditions.
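The unified representation in point (2) can be illustrated with a minimal sketch: each Gaussian carries a learnable semantic logit, and the predicted per-frame deformation is gated by the resulting soft human/scene mask, so scene Gaussians stay effectively static. The function name, the soft-sigmoid gating, and the array layout below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def apply_semantic_deformation(positions, semantic_logits, offsets):
    """Gate per-Gaussian deformation by a learnable semantic attribute.

    positions:       (N, 3) Gaussian centers (hypothetical layout)
    semantic_logits: (N,)   learnable logits; sigmoid(logit) ~ P(human)
    offsets:         (N, 3) deformation predicted for the current frame
    """
    human_prob = sigmoid(semantic_logits)            # soft human mask in [0, 1]
    # Only (softly) human-classified Gaussians receive the offset;
    # near-zero probability leaves scene Gaussians in place.
    return positions + human_prob[:, None] * offsets
```

During optimization the logits would be trained jointly with the other Gaussian attributes, which is what removes the need for an external human mask.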
📝 Abstract
Reconstructing dynamic humans together with static scenes from monocular videos remains difficult, especially under fast motion, where RGB frames suffer from motion blur. Event cameras exhibit distinct advantages, e.g., microsecond temporal resolution, making them a superior sensing choice for dynamic human reconstruction. Accordingly, we present a novel event-guided human-scene reconstruction framework that jointly models the human and the scene from a single monocular event camera via 3D Gaussian Splatting. Specifically, a unified set of 3D Gaussians carries a learnable semantic attribute; only Gaussians classified as human undergo deformation for animation, while scene Gaussians stay static. To combat blur, we propose an event-guided loss that matches simulated brightness changes between consecutive renderings with the event stream, improving local fidelity in fast-moving regions. Our approach removes the need for external human masks and avoids managing separate Gaussian sets. On two benchmark datasets, ZJU-MoCap-Blur and MMHPSD-Blur, it delivers state-of-the-art human-scene reconstruction, with notable gains over strong baselines in PSNR/SSIM and reduced LPIPS, especially for high-speed subjects.
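The event-guided loss described above can be sketched under the standard event-generation model, in which an event fires when the log-brightness change at a pixel exceeds a contrast threshold C: the log-intensity difference between two consecutive renderings is compared against the accumulated signed event count scaled by C. The function name, the L1 penalty, and the specific threshold value are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def event_guided_loss(render_prev, render_next, event_frame, C=0.2, eps=1e-6):
    """Match simulated brightness change between renderings to the event stream.

    render_prev, render_next: (H, W) rendered intensities at consecutive times
    event_frame:              (H, W) signed event count accumulated per pixel
    C:                        contrast threshold of the event camera (assumed)
    """
    # Simulated log-brightness change between the two renderings
    pred_change = np.log(render_next + eps) - np.log(render_prev + eps)
    # Each signed event corresponds to a log-brightness step of size C
    target_change = C * event_frame
    return float(np.mean(np.abs(pred_change - target_change)))
```

Because events are triggered by brightness changes at microsecond resolution, this term supervises exactly the fast-moving regions where RGB frames are blurred.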