AI Summary
Reconstructing high-fidelity, animatable 3D human avatars from motion-blurred monocular video remains challenging because existing methods inadequately model blur induced by the subject's own motion. To address this, we introduce the first trajectory-aware extension of the 3D Gaussian Splatting framework for human motion, proposing motion-driven exposure-time trajectory modeling and pose-aware adaptive fusion of blurred and sharp regions. Our method jointly optimizes the geometric representation and the blur formation process via differentiable 3D Gaussian rendering, pose-guided spatiotemporal blur modeling, and pose-dependent feature fusion. Evaluated on both synthetic and real-world datasets, our approach achieves significant improvements in PSNR, SSIM, and LPIPS while enabling high-quality real-time rendering. The reconstructed models exhibit sharp textures and natural animations, overcoming a key limitation of existing methods: insufficient modeling of subject-intrinsic motion blur.
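The exposure-time trajectory modeling described above amounts to a standard discretized blur formation model: a blurred frame is the average of sharp renders taken at poses sampled along the trajectory during the exposure interval. A minimal sketch of that idea is below; `render_sharp` is a hypothetical stand-in for the differentiable 3DGS renderer, and the linear pose interpolation and sample count are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def render_sharp(pose: np.ndarray) -> np.ndarray:
    # Stand-in for a differentiable 3D Gaussian Splatting render of the
    # avatar at a given body pose; here just a toy 8x8 image whose
    # intensity depends on the pose vector.
    h = w = 8
    return np.full((h, w), np.sin(pose).sum(), dtype=np.float64)

def blurred_render(pose_start: np.ndarray, pose_end: np.ndarray,
                   n_samples: int = 8) -> np.ndarray:
    # Discretize the exposure interval: interpolate the pose trajectory
    # (linearly, as a simplifying assumption) and average the sharp
    # renders at each sub-pose to synthesize the motion-blurred frame.
    ts = np.linspace(0.0, 1.0, n_samples)
    frames = [render_sharp((1.0 - t) * pose_start + t * pose_end) for t in ts]
    return np.mean(frames, axis=0)
```

Because the averaging is differentiable, a photometric loss between `blurred_render(...)` and the observed blurry frame can back-propagate into both the per-sample trajectory poses and the underlying Gaussian parameters, which is the joint optimization the summary refers to.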
Abstract
We introduce Deblur-Avatar, a novel framework for modeling high-fidelity, animatable 3D human avatars from motion-blurred monocular video inputs. Motion blur is prevalent in real-world dynamic video capture, particularly blur caused by the human movement that is central to 3D human avatar modeling. Existing methods either (1) assume sharp image inputs, failing to address the detail loss introduced by motion blur, or (2) model only blur caused by camera movement, neglecting the human-motion blur that is more common in animatable avatars. Our proposed approach integrates a human-movement-based motion blur model into 3D Gaussian Splatting (3DGS). By explicitly modeling human motion trajectories during the exposure time, we jointly optimize the trajectories and the 3D Gaussians to reconstruct sharp, high-quality human avatars. We employ a pose-dependent fusion mechanism to distinguish moving body regions, optimizing both blurred and sharp areas effectively. Extensive experiments on synthetic and real-world datasets demonstrate that Deblur-Avatar significantly outperforms existing methods in rendering quality and quantitative metrics, producing sharp avatar reconstructions and enabling real-time rendering under challenging motion blur conditions.
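The pose-dependent fusion mechanism mentioned above can be pictured as a per-region convex blend between a blur-aware branch and a sharp branch, weighted by how fast each body region is moving. The sketch below is an illustrative assumption about one plausible form of such a gate (a sigmoid over joint speed with an arbitrary threshold), not the paper's actual fusion module; `fusion_weights` and `fuse` are hypothetical names.

```python
import numpy as np

def fusion_weights(joint_speeds: np.ndarray,
                   threshold: float = 1.0,
                   sharpness: float = 4.0) -> np.ndarray:
    # Map per-region pose velocities to a weight in [0, 1] via a sigmoid:
    # fast-moving regions (speed >> threshold) lean on the blur-aware
    # branch, static regions on the sharp branch. Threshold and slope
    # are illustrative hyperparameters.
    return 1.0 / (1.0 + np.exp(-sharpness * (joint_speeds - threshold)))

def fuse(blur_branch: np.ndarray, sharp_branch: np.ndarray,
         w: np.ndarray) -> np.ndarray:
    # Convex combination of the two branches, applied per region.
    return w * blur_branch + (1.0 - w) * sharp_branch
```

A soft, pose-driven gate like this lets the optimizer treat moving limbs with the blur formation model while still supervising static regions (e.g. the torso in a mostly stationary frame) directly against the sharp image content.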