🤖 AI Summary
This work addresses trajectory drift and geometric inconsistency in online monocular 3D reconstruction caused by abrupt changes in camera motion. The authors propose a pose-adaptive streaming reconstruction framework that evaluates frame importance through a motion-aware mechanism combining pose variation with high-frequency image cues: frames exhibiting high geometric novelty drive reconstruction updates, while those with minor viewpoint changes preserve historical context. The method further incorporates relative pose constraints, acceleration regularization, and high-frequency jitter suppression, realized through a trajectory-consistency training objective and a lightweight online stabilization module. Experiments demonstrate significant improvements in trajectory accuracy, depth estimation, and point-cloud quality on long sequences across multiple benchmarks, while maintaining competitive performance on short sequences and low memory overhead for high-quality streaming reconstruction.
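The summary does not give the exact scoring rule, but the idea of combining inter-frame pose variation with high-frequency image cues into a per-frame importance score could be sketched roughly as follows. All function names, weights, and the squashing choice here are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def rotation_angle(R_prev, R_curr):
    """Geodesic angle (radians) between two camera rotation matrices."""
    R_rel = R_prev.T @ R_curr
    cos = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos))

def high_freq_energy(gray):
    """Mean gradient magnitude as a cheap proxy for high-frequency content."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return float(np.mean(np.hypot(gx, gy)))

def frame_importance(R_prev, t_prev, R_curr, t_curr, gray,
                     w_rot=1.0, w_trans=1.0, w_freq=0.1):
    """Illustrative importance score in [0, 1): large pose changes and
    high-frequency imagery push the score toward 1 (update the state);
    near-static frames score near 0 (preserve historical context)."""
    rot = rotation_angle(R_prev, R_curr)
    trans = float(np.linalg.norm(t_curr - t_prev))
    freq = high_freq_energy(gray)
    raw = w_rot * rot + w_trans * trans + w_freq * freq
    return 1.0 - np.exp(-raw)  # squash to [0, 1)
```

A near-static frame (identical pose, flat image) scores close to zero, while a large rotation or translation pushes the score toward 1, which would then gate how strongly that frame updates the reconstruction state.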
📝 Abstract
Online monocular 3D reconstruction enables dense scene recovery from streaming video but remains fundamentally limited by the stability-adaptation dilemma: the reconstruction model must rapidly incorporate novel viewpoints while preserving previously accumulated scene structure. Existing streaming approaches rely on uniform or attention-based update mechanisms that often fail to account for abrupt viewpoint transitions, leading to trajectory drift and geometric inconsistencies over long sequences. We introduce PAS3R, a pose-adaptive streaming reconstruction framework that dynamically modulates state updates according to camera motion and scene structure. Our key insight is that frames contributing significant geometric novelty should exert stronger influence on the reconstruction state, while frames with minor viewpoint variation should prioritize preserving historical context. PAS3R operationalizes this principle through a motion-aware update mechanism that jointly leverages inter-frame pose variation and image frequency cues to estimate frame importance. To further stabilize long-horizon reconstruction, we introduce trajectory-consistent training objectives that incorporate relative pose constraints and acceleration regularization. A lightweight online stabilization module further suppresses high-frequency trajectory jitter and geometric artifacts without increasing memory consumption. Extensive experiments across multiple benchmarks demonstrate that PAS3R significantly improves trajectory accuracy, depth estimation, and point cloud reconstruction quality in long video sequences while maintaining competitive performance on shorter sequences.
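The acceleration regularization mentioned in the trajectory-consistent objectives is not specified in detail here, but a common form of such a term is a second-difference penalty on the estimated camera positions. This is a minimal sketch under that assumption; the function name and mean-squared reduction are illustrative, not the paper's definition:

```python
import numpy as np

def acceleration_loss(positions):
    """Second-difference (discrete acceleration) penalty over a sequence
    of estimated camera positions with shape (N, 3). Constant-velocity
    trajectories incur zero loss; abrupt velocity changes are penalized,
    discouraging high-frequency trajectory jitter."""
    vel = np.diff(positions, axis=0)   # (N-1, 3) discrete velocities
    acc = np.diff(vel, axis=0)         # (N-2, 3) discrete accelerations
    return float(np.mean(np.sum(acc ** 2, axis=1)))
```

A straight constant-velocity trajectory yields zero loss, while perturbing a single waypoint produces a positive penalty, so minimizing this term alongside the reconstruction objectives would favor smooth long-horizon camera paths.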