Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model

📅 2026-04-10
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses geometric inconsistency in sequential 3D reconstruction when a model is applied to new scenes, along with two obstacles to adapting at inference time: the absence of ground truth and the need for efficient online updates. The authors propose an online learning framework built on a frozen, pretrained geometry foundation model. Lightweight learnable visual prompts are updated with a local-global self-supervised strategy, in which local consistency is enforced against previously fused results and global consistency is enforced across sparse keyframes. This lets the model adapt to a scene without ground-truth supervision while preserving the foundation model's geometry prediction capability. The authors report that the approach outperforms previous state-of-the-art methods on various benchmarks.

📝 Abstract
We present Online3R, a new sequential reconstruction framework that adapts to new scenes through online learning, effectively resolving inconsistency issues. Specifically, we introduce a set of lightweight learnable visual prompts into a pretrained, frozen geometry foundation model to capture knowledge of new environments while preserving the foundation model's fundamental capability for geometry prediction. To address the lack of ground truth and the need for high efficiency when updating these visual prompts at test time, we introduce a local-global self-supervised learning strategy that enforces local and global consistency constraints on predictions. The local consistency constraints are imposed on intermediate and previously locally fused results, enabling the model to be trained with high-quality pseudo ground-truth signals; the global consistency constraints operate on sparse keyframes spanning long distances rather than on every frame, allowing the model to learn efficiently from consistent predictions over a long trajectory. Our experiments demonstrate that Online3R outperforms previous state-of-the-art methods on various benchmarks. Project page: https://shunkaizhou.github.io/online3r-1.0/
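The core idea in the abstract, updating only a small set of prompt parameters while the pretrained model stays frozen, can be illustrated with a toy sketch. This is not the authors' implementation: the linear "model", the additive prompt, the constant-offset pseudo targets, and every name below (`frozen_model`, `adapt_prompt`, `mean_squared_error`) are assumptions made purely for illustration.

```python
# Toy illustration of test-time prompt adaptation with a frozen model.
# Everything here is hypothetical; it is not the Online3R architecture.

def frozen_model(x, prompt, weights):
    """Frozen linear predictor applied to the prompt-shifted input."""
    return sum(w * (xi + pi) for w, xi, pi in zip(weights, x, prompt))

def mean_squared_error(frames, targets, prompt, weights):
    """Average squared gap between predictions and pseudo targets."""
    errors = [(frozen_model(x, prompt, weights) - t) ** 2
              for x, t in zip(frames, targets)]
    return sum(errors) / len(errors)

def adapt_prompt(frames, pseudo_targets, weights, steps=200, lr=0.05):
    """Gradient descent on the prompt only; the weights are never modified."""
    prompt = [0.0] * len(weights)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, target in zip(frames, pseudo_targets):
            residual = frozen_model(x, prompt, weights) - target
            # Gradient of 0.5 * residual**2 with respect to prompt[i] is residual * w_i.
            for i, w in enumerate(weights):
                grad[i] += residual * w / len(frames)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

weights = [0.5, -1.0, 2.0]  # stands in for the frozen pretrained parameters
frames = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 2.0, 0.0]]
zero_prompt = [0.0, 0.0, 0.0]

# Pseudo targets: the frozen prediction plus a constant scene-specific offset,
# standing in for the signal a fused reconstruction might provide.
pseudo_targets = [frozen_model(x, zero_prompt, weights) + 0.3 for x in frames]

before = mean_squared_error(frames, pseudo_targets, zero_prompt, weights)
prompt = adapt_prompt(frames, pseudo_targets, weights)
after = mean_squared_error(frames, pseudo_targets, prompt, weights)
```

In this sketch the error against the pseudo targets shrinks after adaptation while `weights` is left untouched, which mirrors the stated goal of adapting to a scene without changing the foundation model itself.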
Problem

Research questions and friction points this paper is trying to address.

sequential reconstruction
online learning
geometry consistency
self-supervised learning
visual prompts
Innovation

Methods, ideas, or system contributions that make the work stand out.

online learning
geometry foundation model
visual prompts
self-supervised learning
sequential reconstruction
Shunkai Zhou
School of Intelligence Science and Technology, Peking University; State Key Laboratory of General Artificial Intelligence
Zike Yan
PostDoc, Tsinghua University; PhD, Peking University
3D Vision, Robotics, Continual Learning
Fei Xue
NVIDIA
Dong Wu
School of Intelligence Science and Technology, Peking University; State Key Laboratory of General Artificial Intelligence
Yuchen Deng
College of Computer and Information Science, Southwest University
Hongbin Zha
Peking University
Computer vision, robot vision