🤖 AI Summary
To address challenges in modeling discrete state transitions, weak cross-observation information fusion, and the inability to synthesize novel states in dynamic 3D scenes, this paper proposes a Discrete-State Recursive Gaussian Fusion framework. Our method achieves geometric alignment across states via semantic correspondence matching and integrates Lie-algebra-based SE(3) motion refinement with visibility-aware voxelized fusion, enabling history-preserving incremental updates and novel-state generation without additional scanning. A replay supervision mechanism ensures photometric consistency. Evaluated on multi-source, long-term datasets, our approach significantly improves reconstruction fidelity and interactive response latency, supports object-level manipulation, and enables high-quality novel view synthesis. To the best of our knowledge, this is the first method to achieve high-accuracy, editable, real-time evolving 3D interactive modeling for discretely evolving scenes.
📝 Abstract
Recent advances in 3D scene representations have enabled high-fidelity novel view synthesis, yet adapting to discrete scene changes and constructing interactive 3D environments remain open challenges in vision and robotics. Existing approaches focus solely on updating a single scene without supporting novel-state synthesis; others rely on diffusion-based object-background decoupling that operates on one state at a time and cannot fuse information across multiple observations. To address these limitations, we introduce RecurGS, a recurrent fusion framework that incrementally integrates discrete Gaussian scene states into a single evolving representation that supports interaction. RecurGS detects object-level changes across consecutive states, aligns their geometric motion using semantic correspondence and Lie-algebra-based SE(3) refinement, and performs recurrent updates that preserve historical structures through replay supervision. A voxelized, visibility-aware fusion module selectively incorporates newly observed regions while keeping stable areas fixed, mitigating catastrophic forgetting and enabling efficient long-horizon updates. RecurGS supports object-level manipulation, synthesizes novel scene states without requiring additional scans, and maintains photorealistic fidelity across evolving environments. Extensive experiments on synthetic and real-world datasets demonstrate that our framework delivers high-quality reconstructions with substantially improved update efficiency, providing a scalable step toward continuously interactive Gaussian worlds.
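To make the Lie-algebra-based SE(3) refinement mentioned above concrete, here is a minimal numpy sketch: the exponential map from an se(3) twist to a rigid transform, plus a Gauss-Newton loop that refines a pose against matched point pairs. The point-to-point residual, the function names, and the iteration count are illustrative assumptions, not the paper's actual objective or implementation.

```python
import numpy as np

def hat(w):
    # Skew-symmetric matrix of a 3-vector (the so(3) "hat" operator).
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def se3_exp(xi):
    # Exponential map: 6-vector twist xi = (rho, omega) -> 4x4 SE(3) transform,
    # using Rodrigues' formula for the rotation and the left Jacobian V for
    # the translation part.
    rho, omega = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    W = hat(omega)
    if theta < 1e-8:
        R = np.eye(3) + W
        V = np.eye(3) + 0.5 * W
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1 - np.cos(theta)) / theta**2 * W @ W)
        V = (np.eye(3) + (1 - np.cos(theta)) / theta**2 * W
             + (theta - np.sin(theta)) / theta**3 * W @ W)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ rho
    return T

def refine_se3(T, P, Q, iters=10):
    # Gauss-Newton refinement of pose T so that T * P aligns with Q
    # (P, Q: (N, 3) matched points, e.g. from semantic correspondences).
    # Each step solves a linearized least-squares problem for a small
    # twist update and composes it onto T via the exponential map.
    for _ in range(iters):
        Pw = (T[:3, :3] @ P.T).T + T[:3, 3]          # transform points
        r = (Pw - Q).reshape(-1)                     # stacked residuals
        J = np.zeros((3 * len(P), 6))
        for i, p in enumerate(Pw):
            J[3 * i:3 * i + 3, :3] = np.eye(3)       # d r / d translation
            J[3 * i:3 * i + 3, 3:] = -hat(p)         # d r / d rotation
        xi = np.linalg.lstsq(J, -r, rcond=None)[0]   # twist update
        T = se3_exp(xi) @ T                          # left-compose update
    return T
```

With exact correspondences this converges to the ground-truth transform in a handful of iterations; in a real pipeline the residual would be weighted by correspondence confidence and robustified against outliers.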