ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the trade-off between efficiency and fidelity in real-time monocular 3D reconstruction from image sequences, this paper proposes a framework that combines feed-forward inference speed with SLAM-level accuracy. Methodologically, it introduces (1) a hierarchical 3D Gaussian representation coupled with Level-of-Detail (LoD)-aware rendering that explicitly models geometry-appearance correlations, reducing redundancy while preserving fine-grained fidelity; and (2) joint camera pose and initial point cloud estimation via a 3D foundation model, followed by a learnable Gaussian decoder that maps multi-scale features into structured 3D Gaussians. Evaluated on eight indoor and outdoor benchmarks, the method achieves reconstruction quality competitive with per-scene optimized state-of-the-art methods, while sustaining SLAM-level real-time performance (>30 FPS) and feed-forward robustness. It is presented as the first approach to simultaneously deliver high fidelity, high efficiency, and high stability in large-scale, complex scenes.
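The Gaussian decoder mentioned above can be pictured with a minimal sketch. It is not ARTDECO's implementation: the single linear layer, the 11-channel output layout, the activations, and the names `decode_gaussians`, `weights`, and `bias` are all assumptions chosen for illustration, and the multi-scale features are assumed to be already fused into one vector per point.

```python
import numpy as np

def decode_gaussians(features, weights, bias):
    """Map per-point features (N, C) to 3D Gaussian parameters.

    Hypothetical output layout (11 channels): 3 log-scales,
    4 quaternion components, 1 opacity logit, 3 color logits.
    """
    raw = features @ weights + bias  # (N, 11)
    log_scale, quat, opacity, color = np.split(raw, [3, 7, 8], axis=1)

    scale = np.exp(log_scale)  # strictly positive extents
    # Normalize so each row is a unit quaternion (a valid rotation).
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-8)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    return {
        "scale": scale,
        "rotation": quat,
        "opacity": sigmoid(opacity),  # in (0, 1)
        "color": sigmoid(color),      # RGB in (0, 1)
    }

# Toy usage with random features and weights (placeholders, not trained values).
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 16))
W = rng.normal(scale=0.1, size=(16, 11))
b = np.zeros(11)
gaussians = decode_gaussians(feats, W, b)
```

The activations are the usual way to keep decoded parameters valid: exponentiation keeps scales positive, normalization yields a proper rotation, and the sigmoid bounds opacity and color.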

📝 Abstract
On-the-fly 3D reconstruction from monocular image sequences is a long-standing challenge in computer vision, critical for applications such as real-to-sim, AR/VR, and robotics. Existing methods face a major tradeoff: per-scene optimization yields high fidelity but is computationally expensive, whereas feed-forward foundation models enable real-time inference but struggle with accuracy and robustness. In this work, we propose ARTDECO, a unified framework that combines the efficiency of feed-forward models with the reliability of SLAM-based pipelines. ARTDECO uses 3D foundation models for pose estimation and point prediction, coupled with a Gaussian decoder that transforms multi-scale features into structured 3D Gaussians. To sustain both fidelity and efficiency at scale, we design a hierarchical Gaussian representation with a LoD-aware rendering strategy, which improves rendering fidelity while reducing redundancy. Experiments on eight diverse indoor and outdoor benchmarks show that ARTDECO delivers interactive performance comparable to SLAM, robustness similar to feed-forward systems, and reconstruction quality close to per-scene optimization, providing a practical path toward on-the-fly digitization of real-world environments with both accurate geometry and high visual fidelity. Explore more demos on our project page: https://city-super.github.io/artdeco/.
Problem

Research questions and friction points this paper is trying to address.

Achieving efficient, high-fidelity 3D reconstruction from monocular sequences
Balancing computational efficiency with reconstruction accuracy in real time
Enabling robust on-the-fly digitization with accurate geometry and visual fidelity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines feed-forward models with SLAM pipelines
Uses 3D foundation models for pose estimation
Employs hierarchical Gaussian representation with LoD rendering
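As a rough illustration of the LoD idea in the last item, the sketch below picks which Gaussians to render based on camera distance. The selection rule is an assumption, not the paper's: level 0 is taken to be the coarsest level, and each halving of the distance below a hypothetical `base_dist` unlocks one finer level. The name `select_lod` and its parameters are likewise illustrative.

```python
import numpy as np

def select_lod(positions, levels, cam_pos, base_dist=8.0, max_level=3):
    """Return a boolean mask of the Gaussians to render.

    Hypothetical rule: coarse Gaussians (level 0) are always drawn;
    finer levels are drawn only when the camera is close enough.
    """
    dist = np.linalg.norm(positions - cam_pos, axis=1)
    # Finest level allowed at each distance, clamped to [0, max_level].
    allowed = np.floor(np.log2(base_dist / np.maximum(dist, 1e-6)))
    allowed = allowed.clip(0, max_level)
    return levels <= allowed

# Two Gaussians near the camera and two far away, at coarse and fine levels.
positions = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0],
                      [0.0, 0.0, 10.0], [0.0, 0.0, 10.0]])
levels = np.array([0, 3, 0, 3])
mask = select_lod(positions, levels, cam_pos=np.zeros(3))
```

In this toy case the fine Gaussian far from the camera is skipped while the other three are kept, which is the kind of redundancy reduction a LoD scheme aims for.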
Authors

Guanghao Li: Fudan University (Graphics)
Kerui Ren: Shanghai Jiao Tong University, Shanghai AI Laboratory (3D Reconstruction, Neural Rendering)
Linning Xu: Shanghai Artificial Intelligence Laboratory, The Chinese University of Hong Kong
Zhewen Zheng: Shanghai Artificial Intelligence Laboratory, Carnegie Mellon University
Changjian Jiang: Shanghai Artificial Intelligence Laboratory, Zhejiang University
Xin Gao: Shanghai Artificial Intelligence Laboratory, Fudan University
Bo Dai: The University of Hong Kong
Jian Pu: Institute of Science and Technology for Brain-inspired Intelligence, Fudan University (Autonomous Systems, Computer Vision, Machine Learning)
Mulin Yu: Shanghai AILab; INRIA (3D reconstruction and 3D repairing)
Jiangmiao Pang: Shanghai Artificial Intelligence Laboratory