LucidFusion: Reconstructing 3D Gaussians with Arbitrary Unposed Images

📅 2024-10-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the limitations of single-image 3D reconstruction, namely its reliance on accurate camera pose estimation and poor generalization, this paper proposes a pose-free 3D Gaussian reconstruction method that accepts an arbitrary number of views. The approach eliminates explicit pose estimation by (1) introducing the Relative Coordinate Map (RCM), which aligns unposed multi-view inputs to a main view by encoding geometry through relative spatial relationships rather than absolute camera poses; (2) extending RCM with Relative Coordinate Gaussians (RCG), which treat each pixel's coordinates as a Gaussian center to counteract the noise that arises without global 3D supervision; and (3) establishing an end-to-end differentiable image-to-3D-Gaussian translation framework that jointly optimizes implicit geometric learning and differentiable rasterization. The method achieves high-fidelity 3D reconstruction within seconds and without pose supervision, significantly improving cross-view consistency and geometric robustness, and attains state-of-the-art accuracy and generalization on standard benchmarks.
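The RCM representation can be pictured as an image whose pixels store 3D coordinates expressed in the main view's camera frame. A minimal sketch of building such a map for the main view from a depth map (this is not the paper's learned network; the intrinsics and the `depth_to_rcm` helper are assumed here purely for illustration):

```python
import numpy as np

def depth_to_rcm(depth, fx, fy, cx, cy):
    """Back-project a depth map into an H x W x 3 coordinate map.

    Each pixel stores the 3D point it observes, expressed in the
    camera frame of this (main) view. In LucidFusion the network
    predicts such maps for all views directly, so no intrinsics or
    poses are needed at inference; they appear here only to make
    the toy example concrete.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grid
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)  # shape (H, W, 3)

# Toy 4x4 depth map with unit depth everywhere.
rcm = depth_to_rcm(np.ones((4, 4)), fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(rcm.shape)   # (4, 4, 3)
print(rcm[2, 2])   # pixel at the principal point -> [0. 0. 1.]
```

Once every view's pixels live in this shared coordinate frame, multi-view fusion reduces to concatenating coordinate maps, with no per-view pose to estimate.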

📝 Abstract
Recent large reconstruction models have made notable progress in generating high-quality 3D objects from single images. However, current reconstruction methods often rely on explicit camera pose estimation or fixed viewpoints, restricting their flexibility and practical applicability. We reformulate 3D reconstruction as image-to-image translation and introduce the Relative Coordinate Map (RCM), which aligns multiple unposed images to a main view without pose estimation. While RCM simplifies the process, its lack of global 3D supervision can yield noisy outputs. To address this, we propose Relative Coordinate Gaussians (RCG) as an extension to RCM, which treats each pixel's coordinates as a Gaussian center and employs differentiable rasterization for consistent geometry and pose recovery. Our LucidFusion framework handles an arbitrary number of unposed inputs, producing robust 3D reconstructions within seconds and paving the way for more flexible, pose-free 3D pipelines.
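The RCG step described above can be pictured as promoting each RCM pixel to a 3D Gaussian: the pixel's stored coordinates become the Gaussian's center, and per-pixel color, scale, and opacity complete the primitive that a differentiable rasterizer then renders for supervision. A minimal data-layout sketch (field names and the `rcm_to_gaussians` helper are illustrative assumptions, not the paper's API; the rasterizer itself is omitted):

```python
import numpy as np

def rcm_to_gaussians(rcm, rgb, scale=0.01, opacity=1.0):
    """Turn an H x W x 3 coordinate map plus an RGB image into a flat
    set of Gaussian primitives, one per pixel, centered at the pixel's
    3D coordinates. Rendering these with a differentiable rasterizer
    is what lets gradients enforce globally consistent geometry; this
    sketch stops before rasterization.
    """
    n = rcm.shape[0] * rcm.shape[1]
    return {
        "centers": rcm.reshape(n, 3),        # 3D means of the Gaussians
        "colors": rgb.reshape(n, 3),         # per-Gaussian color
        "scales": np.full((n, 3), scale),    # isotropic size (toy choice)
        "opacities": np.full((n,), opacity), # alpha used when splatting
    }

gaussians = rcm_to_gaussians(np.zeros((4, 4, 3)), np.ones((4, 4, 3)))
print(gaussians["centers"].shape)  # (16, 3): one Gaussian per pixel
```

Because every Gaussian inherits its center from the coordinate map, rasterization losses flow back into the predicted coordinates, which is how the framework cleans up the noise RCM alone would leave.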
Problem

Research questions and friction points this paper is trying to address.

Reconstructing 3D objects from unposed images without camera pose estimation.
Addressing noisy outputs in 3D reconstruction due to lack of global supervision.
Enabling flexible, pose-free 3D reconstruction with arbitrary unposed image inputs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative Coordinate Map aligns unposed images
Relative Coordinate Gaussians ensure consistent geometry
LucidFusion enables pose-free 3D reconstruction