4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere

📅 2026-02-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work targets two limitations of 4D reconstruction from monocular video: existing methods typically decouple motion from geometry, or they produce only limited 4D attributes such as sparse trajectories or two-view scene flow. It proposes 4RC, a unified feed-forward framework built on an “encode once, query anywhere in space-time” paradigm. 4RC factorizes per-view 4D attributes into base geometry and time-dependent relative motion. A transformer backbone encodes the entire video into a compact spatio-temporal latent space, and a conditional decoder then queries dense 3D geometry and motion for any query frame at any target timestamp. The authors report that 4RC outperforms prior and concurrent methods across a wide range of 4D reconstruction tasks.

πŸ“ Abstract
We present 4RC, a unified feed-forward framework for 4D reconstruction from monocular videos. Unlike existing approaches that typically decouple motion from geometry or produce limited 4D attributes such as sparse trajectories or two-view scene flow, 4RC learns a holistic 4D representation that jointly captures dense scene geometry and motion dynamics. At its core, 4RC introduces a novel encode-once, query-anywhere and anytime paradigm: a transformer backbone encodes the entire video into a compact spatio-temporal latent space, from which a conditional decoder can efficiently query 3D geometry and motion for any query frame at any target timestamp. To facilitate learning, we represent per-view 4D attributes in a minimally factorized form by decomposing them into base geometry and time-dependent relative motion. Extensive experiments demonstrate that 4RC outperforms prior and concurrent methods across a wide range of 4D reconstruction tasks.
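
The encode-once, query-anywhere pattern and the base-plus-relative-motion factorization can be illustrated with a small toy sketch. This is not the 4RC model: the names `VideoLatent`, `encode`, and `query` are hypothetical, the "latent" is just stored data rather than a transformer encoding, and relative motion is assumed to be linear in time and added to the base geometry. The sketch only shows the interface shape, in which encoding runs once and many (query frame, target timestamp) pairs are answered afterwards.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class VideoLatent:
    """Toy stand-in for the spatio-temporal latent (hypothetical, not 4RC's)."""
    base_points: List[List[Point]]  # per frame: base geometry as 3D points
    velocities: List[List[Point]]   # per frame: toy per-point velocity
    timestamps: List[float]         # capture time of each frame


def encode(base_points: List[List[Point]],
           velocities: List[List[Point]],
           timestamps: List[float]) -> VideoLatent:
    """Run once per video. A real encoder would be a learned network."""
    if not (len(base_points) == len(velocities) == len(timestamps)):
        raise ValueError("per-frame inputs must have the same length")
    return VideoLatent(base_points, velocities, timestamps)


def query(latent: VideoLatent, frame_idx: int, t: float) -> List[Point]:
    """Return the points of frame `frame_idx` as they would be at time `t`.

    Assumption: result = base geometry + relative motion, with relative
    motion modeled as velocity * (t - frame timestamp).
    """
    dt = t - latent.timestamps[frame_idx]
    base = latent.base_points[frame_idx]
    vel = latent.velocities[frame_idx]
    return [
        tuple(b + v * dt for b, v in zip(p, u))
        for p, u in zip(base, vel)
    ]


# Encode once, then query the same frame at several target timestamps.
latent = encode(
    base_points=[[(0.0, 0.0, 1.0)]],
    velocities=[[(1.0, 0.0, 0.0)]],
    timestamps=[0.0],
)
at_capture = query(latent, frame_idx=0, t=0.0)  # [(0.0, 0.0, 1.0)]
at_half = query(latent, frame_idx=0, t=0.5)     # [(0.5, 0.0, 1.0)]
```

Querying at the frame's own timestamp returns the base geometry unchanged, because the relative motion term is zero there. In this toy version the target time `t` is a continuous value rather than a frame index.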

Problem

Research questions and friction points this paper is trying to address.

4D reconstruction
monocular video
scene geometry
motion dynamics
spatio-temporal representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

4D reconstruction
conditional querying
spatio-temporal latent space
holistic 4D representation
monocular video