🤖 AI Summary
Online scene change detection (SCD) faces several challenges: the absence of pose priors, the lack of ground-truth annotations, real-time inference requirements, and the need for cross-view consistency. This paper proposes the first real-time, fully unsupervised online SCD method that requires neither pose initialization nor manual labeling. It constructs a dynamic scene representation using 3D Gaussian Splatting and introduces a self-supervised multi-cue fusion loss that jointly optimizes geometric, appearance, and motion consistency. To ensure robust cross-view change reasoning, it integrates a lightweight PnP-based pose estimator and a change-guided incremental update mechanism for the 3D Gaussian primitives. Evaluated on complex real-world datasets, the method outperforms both existing online and offline approaches, achieving state-of-the-art accuracy while maintaining >10 FPS inference speed, making it the first solution to enable high-precision, low-latency, completely unsupervised online SCD.
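The multi-cue fusion idea can be sketched as a weighted combination of per-cue residuals. This is a minimal illustration, not the paper's actual loss: the residual definitions, weights `w`, and the function name `fusion_loss` are all hypothetical.

```python
import numpy as np

def fusion_loss(geom_res, app_res, motion_res, w=(1.0, 0.5, 0.5)):
    """Hypothetical self-supervised multi-cue fusion loss: a weighted sum
    of mean absolute residuals for geometric (e.g. depth), appearance
    (photometric), and motion (flow) consistency. The cue weights `w`
    are illustrative placeholders, not values from the paper."""
    l_geom = np.mean(np.abs(geom_res))    # geometric consistency term
    l_app = np.mean(np.abs(app_res))      # appearance consistency term
    l_motion = np.mean(np.abs(motion_res))  # motion consistency term
    return w[0] * l_geom + w[1] * l_app + w[2] * l_motion
```

In practice each residual would be computed by comparing a rendering of the 3D Gaussian Splatting scene against the live observation; the sketch only shows how the cues are fused into one scalar objective.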
📝 Abstract
Online Scene Change Detection (SCD) is an extremely challenging problem that requires an agent to detect relevant changes on the fly while observing the scene from unconstrained viewpoints. Existing online SCD methods are significantly less accurate than offline approaches. We present the first online SCD approach that is pose-agnostic, label-free, and ensures multi-view consistency, while operating at over 10 FPS and achieving new state-of-the-art performance, surpassing even the best offline approaches. Our method introduces a new self-supervised fusion loss to infer scene changes from multiple cues and observations, PnP-based fast pose estimation against the reference scene, and a fast change-guided update strategy for the 3D Gaussian Splatting scene representation. Extensive experiments on complex real-world datasets demonstrate that our approach outperforms both online and offline baselines.
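The change-guided update strategy mentioned above can be illustrated with a simple selection step: only Gaussian primitives that project into high-residual (changed) image regions are flagged for re-optimization, keeping the per-frame update cheap. This is a hedged sketch under assumed data layouts; the threshold `tau`, the array shapes, and the function name are illustrative, not from the paper.

```python
import numpy as np

def select_gaussians_to_update(residual_map, gauss_pix, tau=0.1):
    """Hypothetical change-guided selection of 3D Gaussian primitives.

    residual_map: (H, W) per-pixel change residual between the rendered
                  reference scene and the current observation.
    gauss_pix:    (N, 2) integer (x, y) pixel coordinates where each
                  Gaussian's center projects in the current view.
    Returns a boolean mask of the N Gaussians whose projected pixel
    exceeds the change threshold tau and should be re-optimized.
    """
    # Look up each Gaussian's residual at its projected pixel (y, x order)
    r = residual_map[gauss_pix[:, 1], gauss_pix[:, 0]]
    return r > tau
```

Restricting optimization to this mask is one plausible way an incremental 3DGS update stays within a real-time budget while the rest of the representation is left untouched.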