Monocular Online Reconstruction with Enhanced Detail Preservation

📅 2025-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses key challenges in online 3D reconstruction from monocular video, including the absence of depth supervision, inaccurate Gaussian distribution modeling, and local-global inconsistency, by proposing a real-time, RGB-only Gaussian mapping method. The approach introduces three core contributions: (1) a hierarchical Gaussian management module that enables dynamic, scale-adaptive 3D Gaussian ellipsoid placement; (2) a compact spatial representation based on multi-level occupancy hash voxels (MOHV); and (3) a global consistency optimization framework that jointly enforces photometric and geometric constraints. The method requires neither depth maps nor pre-trained models and integrates with standard visual odometry pipelines. It achieves real-time performance (>20 FPS) while significantly improving geometric accuracy and texture fidelity. Extensive evaluation demonstrates state-of-the-art results across multiple benchmarks, outperforming existing online RGB and RGB-D reconstruction methods.

📝 Abstract

We propose an online 3D Gaussian-based dense mapping framework for reconstructing photorealistic detail from a monocular image stream. Our approach addresses two key challenges in monocular online reconstruction: distributing Gaussians without relying on depth maps and ensuring both local and global consistency in the reconstructed maps. To achieve this, we introduce two key modules: the Hierarchical Gaussian Management Module for effective Gaussian distribution and the Global Consistency Optimization Module for maintaining alignment and coherence at all scales. In addition, we present the Multi-level Occupancy Hash Voxels (MOHV), a structure that regularizes Gaussians for capturing details across multiple levels of granularity. MOHV ensures accurate reconstruction of both fine and coarse geometries and textures, preserving intricate details while maintaining overall structural integrity. Compared to state-of-the-art RGB-only and even RGB-D methods, our framework achieves superior reconstruction quality with high computational efficiency. Moreover, it integrates seamlessly with various tracking systems, ensuring generality and scalability.
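
The abstract does not spell out how MOHV is laid out, so the following is only a rough sketch of the general idea of a multi-level occupancy hash over voxels, not the authors' implementation. It assumes one hash set of integer voxel keys per level, with the voxel edge halving at each finer level; the class name `MultiLevelOccupancyHash`, the method `try_insert`, and the default sizes are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from math import floor
from typing import List, Set, Tuple

Key = Tuple[int, int, int]
Point = Tuple[float, float, float]


@dataclass
class MultiLevelOccupancyHash:
    # Assumed values: the source does not give voxel sizes or level counts.
    base_voxel_size: float = 0.4  # edge length of the coarsest level
    num_levels: int = 3           # each finer level halves the edge length
    levels: List[Set[Key]] = field(init=False)

    def __post_init__(self) -> None:
        # One hash set of occupied voxel keys per level.
        self.levels = [set() for _ in range(self.num_levels)]

    def voxel_size(self, level: int) -> float:
        return self.base_voxel_size / (2 ** level)

    def key(self, point: Point, level: int) -> Key:
        # Quantize a 3D point to the integer index of its voxel at this level.
        size = self.voxel_size(level)
        return (floor(point[0] / size),
                floor(point[1] / size),
                floor(point[2] / size))

    def is_occupied(self, point: Point, level: int) -> bool:
        return self.key(point, level) in self.levels[level]

    def try_insert(self, point: Point, level: int) -> bool:
        # Returns True if the voxel was free and is now marked occupied,
        # i.e. a caller could place a new Gaussian there; False otherwise.
        voxel = self.key(point, level)
        if voxel in self.levels[level]:
            return False
        self.levels[level].add(voxel)
        return True


grid = MultiLevelOccupancyHash()
first = grid.try_insert((0.1, 0.1, 0.1), level=0)   # coarse voxel is free
second = grid.try_insert((0.3, 0.3, 0.3), level=0)  # same coarse voxel
finer = grid.try_insert((0.3, 0.3, 0.3), level=1)   # distinct finer voxel
```

In this sketch, two nearby points collide at the coarse level but land in separate voxels at a finer level, which is one plausible way a per-level occupancy test could limit redundant Gaussians in flat regions while still admitting extra ones where detail is needed.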

Problem

Research questions and friction points this paper is trying to address.

Online 3D reconstruction from monocular images without depth maps

Ensuring local and global consistency in reconstructed 3D maps

Preserving photorealistic details across multiple granularity levels

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Gaussian Management for effective distribution

Global Consistency Optimization for multi-scale alignment

Multi-level Occupancy Hash Voxels for granular detail capture

🔎 Similar Papers

Generalizable 3D Scene Reconstruction via Divide and Conquer from a Single View