GCRayDiffusion: Pose-Free Surface Reconstruction via Geometric Consistent Ray Diffusion

📅 2025-03-28
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses sparse-view surface reconstruction without ground-truth camera pose labels. We propose an end-to-end, pose-agnostic method that jointly optimizes implicit signed distance function (SDF) geometry and camera poses, without requiring pose priors or explicit pose estimation. Our approach introduces three key innovations: (1) modeling unknown camera poses as learnable neural ray bundles; (2) designing a geometrically consistent ray diffusion model (GCRayDiffusion), conditioned on triplane-based SDFs to guide pose denoising; and (3) enforcing multi-view geometric consistency via surface-aware geometric regularization at ray sampling points. By unifying implicit surface representation learning with pose optimization in a single differentiable framework, our method achieves significantly improved geometric accuracy and cross-view consistency under sparse-view settings. Quantitatively, it reduces pose estimation error substantially compared to existing unsupervised methods. This work establishes a new paradigm for pose-free NeRF and implicit reconstruction.
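
To make the idea of "camera poses as ray bundles" concrete, the following is a minimal sketch of turning one pinhole camera into a small bundle of rays, one per image patch. It assumes a Plücker parameterization (unit direction plus moment), which the summary does not confirm; the function name `camera_to_ray_bundle`, the patch grid, and the normalized image plane are all hypothetical choices made for illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def camera_to_ray_bundle(R, center, focal, grid=2):
    """Return one (direction, moment) ray per cell of a grid x grid patch layout.

    R      : 3x3 camera-to-world rotation, as a list of rows (assumed convention)
    center : camera centre in world coordinates
    focal  : focal length in normalized units (image plane spans [-1, 1])
    """
    rays = []
    for i in range(grid):
        for j in range(grid):
            # Patch centre in normalized image coordinates.
            u = -1 + (2 * j + 1) / grid
            v = -1 + (2 * i + 1) / grid
            d_cam = (u / focal, v / focal, 1.0)
            # Rotate the viewing direction into world coordinates.
            d_world = tuple(sum(R[r][c] * d_cam[c] for c in range(3))
                            for r in range(3))
            d = normalize(d_world)
            # Plücker moment: encodes where the ray sits in space.
            m = cross(center, d)
            rays.append((d, m))
    return rays
```

Every ray in the bundle passes through the same camera centre, so the pose is encoded redundantly across rays. In a diffusion setting, such a bundle would be the quantity that gets noised and denoised, rather than a rotation matrix and translation vector directly.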

📝 Abstract
Accurate surface reconstruction from unposed images is crucial for efficient 3D object or scene creation. However, it remains challenging, particularly because camera poses must be estimated jointly with the surface. Previous approaches have achieved impressive pose-free surface reconstruction results in dense-view settings, but can easily fail in sparse-view scenarios without sufficient visual overlap. In this paper, we propose a new technique for pose-free surface reconstruction, which follows triplane-based signed distance field (SDF) learning but regularizes the learning with explicit points sampled from ray-based diffusion of camera pose estimation. Our key contribution is a novel Geometric Consistent Ray Diffusion model (GCRayDiffusion), where we represent camera poses as neural bundle rays and regress the distribution of noisy rays via a diffusion model. More importantly, we further condition the denoising process of GCRayDiffusion on the triplane-based SDF of the entire scene, which provides effective 3D-consistent regularization to achieve multi-view consistent camera pose estimation. Finally, we incorporate GCRayDiffusion into the triplane-based SDF learning by introducing on-surface geometric regularization from the sampling points of the neural bundle rays, which leads to highly accurate pose-free surface reconstruction results even for sparse-view inputs. Extensive evaluations on public datasets show that our GCRayDiffusion achieves more accurate camera pose estimation than previous approaches, with geometrically more consistent surface reconstruction results, especially given sparse-view inputs.
Problem

Research questions and friction points this paper is trying to address.

Pose-free 3D surface reconstruction from unposed images
Joint camera pose estimation in sparse-view scenarios
Geometric consistent regularization for multi-view pose estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Triplane-based SDF learning with ray diffusion
Geometric Consistent Ray Diffusion model
On-surface geometric regularization for accuracy
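
The on-surface regularization listed above can be illustrated with a small sketch. It assumes the regularizer penalizes the absolute SDF value at points sampled along each ray at an estimated depth, which is one plausible reading of the abstract and not the paper's stated loss. An analytic sphere SDF stands in for the learned triplane-based SDF, and the names `sphere_sdf` and `on_surface_loss` are hypothetical.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere at the origin (stand-in for a learned SDF)."""
    return math.sqrt(sum(x * x for x in p)) - radius


def on_surface_loss(sdf, rays, depths):
    """Mean |sdf| at the points origin + depth * direction.

    rays   : list of (origin, direction) pairs, direction assumed unit length
    depths : one estimated surface depth per ray
    A point that lies exactly on the surface contributes zero.
    """
    total = 0.0
    for (origin, direction), t in zip(rays, depths):
        point = tuple(o + t * d for o, d in zip(origin, direction))
        total += abs(sdf(point))
    return total / len(rays)


# One ray looking at the unit sphere from z = -3 along +z.
ray = ((0.0, 0.0, -3.0), (0.0, 0.0, 1.0))
exact = on_surface_loss(sphere_sdf, [ray], [2.0])   # sample lands on the surface
off = on_surface_loss(sphere_sdf, [ray], [2.5])     # sample lands inside the sphere
```

With a correct depth the sampled point sits on the zero level set and the loss vanishes; an error in the ray or in its depth moves the point off the surface and raises the loss. That dependence is what would let such a term couple pose estimation to the SDF.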