🤖 AI Summary
To address the challenge of balancing robustness and accuracy in real-time RGB-D SLAM dense reconstruction under severe camera motion (e.g., large viewpoint changes, rapid translation/rotation, or sudden jitter), this paper proposes a learning-optimization co-designed framework. Methodologically, it introduces a two-stage paradigm: "learning-driven initialization + geometry-guided stochastic optimization." First, a lightweight CNN regresses metrically consistent relative poses to provide high-quality initialization for optimization. Second, a stochastic sampling optimization strategy, guided by depth-map geometric consistency, is devised to achieve robust and high-precision depth alignment. On dynamic-motion benchmarks, the method significantly outperforms state-of-the-art approaches; on stable sequences, it matches their accuracy while maintaining real-time performance (>30 FPS). The core contribution is a tightly coupled mechanism between learned priors and geometric optimization that delivers real-time operation, robustness to extreme motion, and sub-centimeter reconstruction accuracy simultaneously.
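To make the two-stage paradigm concrete, below is a minimal toy sketch (not the authors' code; all names are hypothetical, and the pose is simplified to a 2D rigid transform). Stage 1 is stood in for by a fixed coarse pose guess, playing the role of the regression network's metric-aware output; stage 2 refines it by stochastic sampling around the best pose so far, minimizing a point-alignment cost that stands in for depth-map geometric consistency:

```python
import math
import random

def transform(points, pose):
    """Apply a 2D rigid transform (tx, ty, theta) to a list of points."""
    tx, ty, th = pose
    c, s = math.cos(th), math.sin(th)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def alignment_cost(pose, src, dst):
    """Mean squared distance between transformed source points and their
    targets -- a toy proxy for depth-map geometric consistency."""
    moved = transform(src, pose)
    return sum((mx - dx) ** 2 + (my - dy) ** 2
               for (mx, my), (dx, dy) in zip(moved, dst)) / len(src)

def randomized_refine(init_pose, src, dst, iters=300, sigma=0.1, seed=0):
    """Stage 2 stand-in: sample candidate poses around the best one found
    so far, shrinking the sampling radius when a sample fails to improve."""
    rng = random.Random(seed)
    best = list(init_pose)
    best_cost = alignment_cost(best, src, dst)
    for _ in range(iters):
        cand = [p + rng.gauss(0.0, sigma) for p in best]
        cost = alignment_cost(cand, src, dst)
        if cost < best_cost:
            best, best_cost = cand, cost
        else:
            sigma *= 0.99  # anneal the search radius
    return best, best_cost

# Toy "scene": a 5x5 grid of points observed before and after camera motion.
true_pose = (0.30, -0.20, 0.25)
src = [(float(i % 5), float(i // 5)) for i in range(25)]
dst = transform(src, true_pose)

# Stage 1 stand-in: a coarse metric pose, as the regression network would give.
init_pose = (0.25, -0.15, 0.20)

refined, cost = randomized_refine(init_pose, src, dst)
print("initial cost:", round(alignment_cost(init_pose, src, dst), 4))
print("refined cost:", round(cost, 6))
```

The point of the sketch is the division of labor: the learned initializer only needs to land inside the basin of attraction, after which cheap stochastic sampling drives the geometric cost down, which mirrors why the full system stays robust under large motions while retaining optimization-level accuracy.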
📝 Abstract
Real-time dense scene reconstruction during unstable camera motions is crucial for robotics, yet current RGB-D SLAM systems fail when cameras experience large viewpoint changes, fast motions, or sudden shaking. Classical optimization-based methods deliver high accuracy but break down under the poor initialization that large motions cause, while learning-based approaches provide robustness but lack sufficient accuracy for dense reconstruction. We address this challenge by combining learning-based initialization with optimization-based refinement. Our method employs a camera pose regression network to predict metric-aware relative poses from consecutive RGB-D frames, which serve as reliable starting points for a randomized optimization algorithm that further aligns depth images with the scene geometry. Extensive experiments demonstrate promising results: our approach outperforms the best competitor on challenging benchmarks, while maintaining comparable accuracy on stable motion sequences. The system operates in real-time, showing that combining simple, principled techniques can achieve both robustness to unstable motions and accuracy for dense reconstruction. Project page: https://github.com/siyandong/PROFusion.