🤖 AI Summary
This work jointly addresses 3D scene reconstruction and deblurring from motion-blurred images. We propose DeblurSplat, the first deblurring 3D Gaussian Splatting (3D-GS) framework that does not rely on Structure-from-Motion (SfM) for pose estimation. The method uses a pretrained dense stereo model (DUSt3R) to generate an initial point cloud directly from the blurred input images. In parallel, it combines the high-temporal-resolution event stream from an event camera with the blurred images to decode latent sharp frames, which provide fine-grained supervision for reconstruction. Because camera poses are never computed as an intermediate result, pose errors do not propagate into the positions of the initial point cloud. Experiments across a range of scenes show that the approach generates high-fidelity novel views and achieves significant rendering efficiency relative to state-of-the-art deblurring 3D-GS methods.
📝 Abstract
In this paper, we propose the first Structure-from-Motion (SfM)-free deblurring 3D Gaussian Splatting (3D-GS) method that uses an event camera, dubbed DeblurSplat. We address the motion-deblurring problem in two ways. First, we leverage the pretrained dense stereo module DUSt3R to obtain accurate initial point clouds directly from blurred images. Because camera poses are not computed as an intermediate result, we avoid the cumulative errors that would otherwise propagate from inaccurate camera poses to the positions of the initial point cloud. Second, we introduce the event stream into the deblurring pipeline because of its high sensitivity to dynamic change. By decoding latent sharp images from the event stream and the blurred images, we provide a fine-grained supervision signal for optimizing the scene reconstruction. Extensive experiments across a range of scenes demonstrate that DeblurSplat not only generates high-fidelity novel views but also achieves significant rendering efficiency compared with state-of-the-art deblurring 3D-GS methods.
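The abstract does not say how the latent sharp images are decoded. One common formulation for recovering a sharp frame from a blurred frame plus events is the event-based double integral (EDI) model, in which the blurred image is the time average of latent frames related by L(s) = L(t_ref) · exp(c · E(s)), where E(s) is the signed event count between t_ref and s. The sketch below illustrates that model only; it is an assumption for illustration and not necessarily DeblurSplat's decoder. The function name `edi_latent_sharp`, the contrast threshold `c`, and the `(t, x, y, polarity)` event layout are all hypothetical.

```python
import math


def edi_latent_sharp(blurred, events, t_ref, t_start, t_end, c=0.2, n_samples=64):
    """Illustrative EDI-style estimate of the latent sharp frame at t_ref.

    blurred: 2D list of linear intensities averaged over [t_start, t_end].
    events:  iterable of (t, x, y, polarity) tuples, polarity in {+1, -1}.
    c:       assumed contrast threshold of the event sensor (hypothetical value).
    """
    height, width = len(blurred), len(blurred[0])

    # Sample times spanning the exposure window.
    step = (t_end - t_start) / (n_samples - 1)
    sample_times = [t_start + i * step for i in range(n_samples)]

    # Group events by pixel so each pixel is integrated independently.
    per_pixel = {}
    for t, x, y, p in events:
        per_pixel.setdefault((y, x), []).append((t, p))

    sharp = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            pixel_events = per_pixel.get((y, x), [])
            total = 0.0
            for s in sample_times:
                # E(s): signed event count between t_ref and s.
                if s >= t_ref:
                    e = sum(p for t, p in pixel_events if t_ref < t <= s)
                else:
                    e = -sum(p for t, p in pixel_events if s < t <= t_ref)
                total += math.exp(c * e)
            # The blurred pixel is L(t_ref) times the mean of exp(c * E(s)).
            sharp[y][x] = blurred[y][x] / (total / n_samples)
    return sharp
```

A frame decoded this way could serve as a per-view sharp target in a photometric loss, which is one plausible reading of the "fine-grained supervision signal" described above.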