🤖 AI Summary
Existing online free-viewpoint video (FVV) reconstruction methods model motion point by point and ignore motion priors, which leads to prohibitive storage requirements. To address this, we propose Compact Gaussian Streaming (ComGS), a keypoint-driven online Gaussian splatting framework. First, motion-sensitive keypoints are localized within dynamic regions using a view-space gradient difference. Second, an adaptive motion-driven mechanism predicts a spatial influence field that propagates each keypoint's motion to neighboring Gaussian points, exploiting the locality and consistency of motion. Third, an error-aware keyframe reconstruction strategy selectively refines erroneous regions to mitigate error accumulation. Because only keypoint attributes are transmitted, ComGS reduces storage by over 159× compared to 3DGStream and 14× compared to the state-of-the-art QUEEN, while maintaining competitive visual fidelity and rendering speed.
📝 Abstract
3D Gaussian Splatting (3DGS) has emerged as a high-fidelity and efficient paradigm for online free-viewpoint video (FVV) reconstruction, offering viewers rapid responsiveness and immersive experiences. However, existing online methods suffer from prohibitive storage requirements, primarily because their point-wise modeling fails to exploit the properties of motion. To address this limitation, we propose a novel Compact Gaussian Streaming (ComGS) framework that leverages the locality and consistency of motion in dynamic scenes and models object-consistent Gaussian point motion through a keypoint-driven motion representation. By transmitting only the keypoint attributes, this framework provides a more storage-efficient solution. Specifically, we first identify a sparse set of motion-sensitive keypoints localized within motion regions using a view-space gradient difference strategy. Equipped with these keypoints, we propose an adaptive motion-driven mechanism that predicts a spatial influence field for propagating keypoint motion to neighboring Gaussian points with similar motion. Moreover, ComGS adopts an error-aware correction strategy for keyframe reconstruction that selectively refines erroneous regions and mitigates error accumulation without unnecessary overhead. Overall, ComGS achieves a storage reduction of over 159× compared to 3DGStream and 14× compared to the state-of-the-art method QUEEN, while maintaining competitive visual fidelity and rendering speed. Our code will be released.
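To make the keypoint-driven idea more concrete, the following is a minimal NumPy sketch of the two steps described above: picking keypoints where the view-space gradient changes most between frames, and spreading each keypoint's translation to nearby Gaussian centers. It is an illustration under stated assumptions, not the ComGS implementation. The function names (`select_keypoints`, `propagate_motion`), the top-k selection rule, and the fixed isotropic Gaussian falloff with a hard cutoff are all hypothetical; the abstract states that the influence field is predicted adaptively, which this sketch replaces with a hand-set `radius`.

```python
import numpy as np


def select_keypoints(grad_prev, grad_curr, k):
    """Return indices of the k points whose view-space gradient
    magnitude changed most between two frames (assumed top-k rule)."""
    diff = np.abs(grad_curr - grad_prev)
    # argsort is ascending, so take the last k and reverse them.
    return np.argsort(diff)[-k:][::-1]


def propagate_motion(points, keypoint_idx, keypoint_shift, radius):
    """Move every point by a distance-weighted blend of keypoint shifts.

    points:         (N, 3) Gaussian centers
    keypoint_idx:   (K,)   indices into points
    keypoint_shift: (K, 3) translation of each keypoint
    radius:         float  hand-set falloff scale (assumption; the paper
                           predicts the influence field instead)
    """
    keypoints = points[keypoint_idx]                       # (K, 3)
    delta = points[:, None, :] - keypoints[None, :, :]     # (N, K, 3)
    dist_sq = (delta ** 2).sum(axis=-1)                    # (N, K)

    # Gaussian falloff models locality; points beyond 3 radii get no motion.
    weights = np.exp(-dist_sq / (2.0 * radius ** 2))
    weights[dist_sq > (3.0 * radius) ** 2] = 0.0

    # Normalize only where several keypoints overlap (total weight > 1),
    # so that isolated, distant points still receive a small shift.
    total = weights.sum(axis=1, keepdims=True)             # (N, 1)
    displacement = (weights @ keypoint_shift) / np.maximum(total, 1.0)
    return points + displacement


# Toy scene: two nearby points and one far away.
points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [10.0, 0.0, 0.0]])
grad_prev = np.array([0.0, 0.0, 0.0])
grad_curr = np.array([1.0, 0.2, 0.0])

idx = select_keypoints(grad_prev, grad_curr, k=1)
shift = np.array([[1.0, 0.0, 0.0]])
moved = propagate_motion(points, idx, shift, radius=0.5)
```

In this toy scene only the single keypoint's translation would need to be stored per frame; the neighboring point follows it almost fully, while the distant point stays put. That is the intuition behind transmitting keypoint attributes rather than per-point motion.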