OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects

📅 2025-10-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses online 3D reconstruction of freely moving objects from monocular video. The setting is hard: camera poses are unknown, no depth priors are available, object motion is arbitrary, and reliable geometric cues are scarce, yet the model must stay high-fidelity, object-centric, and real-time. The authors propose an online Gaussian reconstruction framework built on a feed-forward network. Its core innovation is a dual-key memory module that combines implicit appearance-geometry keys with explicit directional keys to enable robust state aggregation and spatially guided memory retrieval. The method uses temporal feature aggregation, spatially guided sparse readout, and an efficient Gaussian sparsification mechanism to maintain a dynamically updated, dense Gaussian primitive field. On real-world datasets, the approach significantly outperforms existing pose-free methods; reconstruction quality improves steadily as more frames are observed, while memory footprint and computational cost remain constant.

📝 Abstract

Free-moving object reconstruction from monocular video remains challenging, particularly without reliable pose or depth cues and under arbitrary object motion. We introduce OnlineSplatter, a novel online feed-forward framework generating high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera pose, depth priors, or bundle optimization. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field, maintaining constant computational cost regardless of video sequence length. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys, robustly fusing current frame features with temporally aggregated object states. This design enables effective handling of free-moving objects via spatial-guided memory readout and an efficient sparsification mechanism, ensuring comprehensive yet compact object coverage. Evaluations on real-world datasets demonstrate that OnlineSplatter significantly outperforms state-of-the-art pose-free reconstruction baselines, consistently improving with more observations while maintaining constant memory and runtime.
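The dual-key readout described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `dual_key_readout`, the additive combination of the two similarity scores, the `dir_weight` parameter, and the softmax weighting are all assumptions chosen for illustration. The sketch only shows the general idea of scoring each memory slot by both a latent key and a directional key, then reading out a weighted blend of stored values.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)


def dual_key_readout(query_latent, query_dir, memory, dir_weight=1.0):
    """Read from memory using two keys per slot (hypothetical sketch).

    memory: list of (latent_key, dir_key, value) tuples.
    Returns (blended_value, attention_weights).
    """
    q_lat = _normalize(query_latent)
    q_dir = _normalize(query_dir)

    # Assumption: the two cosine similarities are simply added.
    scores = []
    for latent_key, dir_key, _ in memory:
        latent_sim = _dot(q_lat, _normalize(latent_key))
        dir_sim = _dot(q_dir, _normalize(dir_key))
        scores.append(latent_sim + dir_weight * dir_sim)

    # Numerically stable softmax over the memory slots.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the stored values.
    dim = len(memory[0][2])
    blended = [
        sum(w * slot[2][i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return blended, weights


# Two slots: one seen from +z, one seen from +x.
memory = [
    ([1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0]),
    ([0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0]),
]
blended, weights = dual_key_readout([1.0, 0.0], [0.0, 0.0, 1.0], memory)
```

A query that matches the first slot in both appearance and viewing direction pulls most of the attention weight onto that slot, so the readout is dominated by its stored value.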

Problem

Research questions and friction points this paper is trying to address.

Reconstructs free-moving objects from monocular video

Operates without camera poses, depth priors, or bundle optimization

Maintains constant computational cost regardless of sequence length

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online feed-forward framework without camera pose

Dual-key memory module combining latent appearance-geometry keys with explicit directional keys

Progressive Gaussian refinement at constant computational cost, regardless of sequence length
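One way to see why an online memory can keep a constant footprint is a fixed set of slots that new observations are merged into rather than appended to. The sketch below is a generic illustration under stated assumptions, not the paper's mechanism: the class name `DirectionalMemory`, the fixed anchor directions, and the running-mean update are all hypothetical choices.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)


class DirectionalMemory:
    """Fixed-capacity memory indexed by viewing direction (hypothetical)."""

    def __init__(self, anchor_dirs, feat_dim):
        self.anchors = [_normalize(d) for d in anchor_dirs]
        self.values = [[0.0] * feat_dim for _ in anchor_dirs]
        self.counts = [0] * len(anchor_dirs)

    def write(self, view_dir, feature):
        """Merge a feature into the slot whose anchor is closest in direction."""
        d = _normalize(view_dir)
        idx = max(range(len(self.anchors)), key=lambda i: _dot(d, self.anchors[i]))
        self.counts[idx] += 1
        n = self.counts[idx]
        # Running mean: the slot is updated in place, so storage never grows.
        self.values[idx] = [v + (f - v) / n for v, f in zip(self.values[idx], feature)]
        return idx

    def size(self):
        return len(self.anchors)


# Six axis-aligned anchors; stream 100 frames of an object rotating about z.
axes = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
mem = DirectionalMemory(axes, feat_dim=2)
for t in range(100):
    mem.write([math.cos(0.1 * t), math.sin(0.1 * t), 0.2], [float(t), 1.0])

# Separate check of the running-mean update on a single slot.
mem2 = DirectionalMemory(axes, feat_dim=2)
slot_a = mem2.write([0.0, 0.0, 1.0], [2.0, 4.0])
slot_b = mem2.write([0.0, 0.0, 2.0], [4.0, 8.0])
```

However many frames are written, the number of slots stays at six, so per-frame cost depends on the slot count rather than on the sequence length.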

🔎 Similar Papers

DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction