🤖 AI Summary
This work addresses the limitations of existing Gaussian splatting–based change detection methods, which rely on comparisons of rendered pixels or features and thus struggle to ensure multi-view consistency or distinguish between types of changes. The authors propose the first approach that directly compares attributes—such as position, anisotropic covariance, and color—in the native Gaussian primitive space. By introducing geometric and photometric drift models along with an observability term to account for uncertainties in primitive representations, the method achieves multi-view consistent change detection without requiring additional optimization. Furthermore, it enables unsupervised differentiation between structural and surface-level changes. Evaluated on real-world benchmarks, the approach improves mean Intersection over Union (mIoU) by approximately 17% over the current state of the art.
📝 Abstract
Scene change detection methods built on Gaussian splatting universally follow a render-then-compare paradigm: the pre-change scene is rendered into 2D and compared against post-change images via pixel or feature residuals. Change detection with Gaussian splatting has thus been treated as a question about pixels; we treat it as a question about primitives. We provide direct evidence that native primitive attributes alone (position, anisotropic covariance, and color) carry sufficient signal for scene change detection. What makes primitive-space comparison hard is the under-constrained nature of the Gaussian splatting representation: independent optimizations yield primitive solutions whose count, positions, shapes, and colors differ even where nothing has changed. We address this challenge with anisotropic models of geometric and photometric drift, complemented by a per-primitive observability term that reflects the extent to which each Gaussian is constrained by the camera geometry. Operating directly on primitives gives our method, GS-DIFF, two properties that distinguish it from render-then-compare methods. First, change maps are multi-view consistent by construction, whereas prior work had to learn this through an additional optimization objective. Second, geometric and appearance changes are scored separately, identifying not just where but what kind of change occurred, and distinguishing structural changes (e.g., an added object) from surface-level ones (e.g., a color change) without supervision or external model dependencies. On real-world benchmarks, GS-DIFF surpasses the prior state-of-the-art approach by approximately 17% in mean Intersection over Union.
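As a rough illustration of what comparing Gaussians in primitive space can look like, the sketch below scores each post-change Gaussian against its nearest pre-change Gaussian, keeping a geometric score and a photometric score separate. It is not the method's actual drift or observability model, which the abstract does not specify: the function name `primitive_change_scores`, the brute-force nearest-neighbour matching, the summed-covariance Mahalanobis distance, and the plain RGB distance are all assumptions made for illustration.

```python
import numpy as np


def primitive_change_scores(mu_a, cov_a, rgb_a, mu_b, cov_b, rgb_b):
    """Score each Gaussian of scene B against its nearest Gaussian in scene A.

    Hypothetical helper, not taken from the paper.
    mu_*:  (N, 3) centers
    cov_*: (N, 3, 3) anisotropic covariances
    rgb_*: (N, 3) colors in [0, 1]
    Returns (geometric, photometric) arrays of shape (len(mu_b),).
    """
    # Brute-force nearest neighbour by Euclidean distance (fine for a toy example).
    d2 = ((mu_b[:, None, :] - mu_a[None, :, :]) ** 2).sum(axis=-1)
    nn = d2.argmin(axis=1)

    # Assumed geometric score: Mahalanobis distance under the summed covariances,
    # so displacement along an elongated axis is penalized less than across it.
    diff = mu_b - mu_a[nn]
    sigma = cov_b + cov_a[nn]
    solved = np.linalg.solve(sigma, diff[..., None])[..., 0]
    geometric = np.sqrt(np.einsum("ni,ni->n", diff, solved))

    # Assumed photometric score: Euclidean distance between matched colors.
    photometric = np.linalg.norm(rgb_b - rgb_a[nn], axis=1)
    return geometric, photometric
```

Under these assumptions, a Gaussian that merely slides along its long axis receives a small geometric score, a Gaussian far from any pre-change primitive receives a large one, and a recolored but unmoved Gaussian shows up only in the photometric score.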