🤖 AI Summary
To address the performance degradation in zero-shot image matching caused by the absence of geometric priors in training data, this paper introduces 3D Gaussian Splatting (3DGS) into semi-dense correspondence data generation for the first time, proposing a geometry-faithful synthesis framework. The method improves 3DGS reconstruction accuracy via depth optimization and geometric refinement, and enforces robust cross-view correspondence learning through an epipolar-constrained 2D–3D representation alignment mechanism. A self-supervised Gaussian-attribute learning strategy further improves the geometric consistency of the generated pseudo-labels. Experiments show that the synthesized correspondences reduce epipolar error by up to 40×, and that zero-shot matching performance improves by up to 17.7% across multiple public benchmarks, significantly outperforming state-of-the-art methods.
📝 Abstract
Learning-based image matching critically depends on large-scale, diverse, and geometrically accurate training data. 3D Gaussian Splatting (3DGS) enables photorealistic novel-view synthesis and is therefore attractive for data generation. However, its geometric inaccuracies and biased depth rendering currently prevent robust correspondence labeling. To address this, we introduce MatchGS, the first framework designed to systematically correct and leverage 3DGS for robust, zero-shot image matching. Our approach is twofold: (1) a geometrically faithful data generation pipeline that refines 3DGS geometry to produce highly precise correspondence labels, enabling the synthesis of a vast and diverse range of viewpoints without compromising rendering fidelity; and (2) a 2D–3D representation alignment strategy that infuses the explicit 3D knowledge of 3DGS into 2D semi-dense matchers, guiding them to learn viewpoint-invariant 3D representations. Our generated ground-truth correspondences reduce the epipolar error by up to 40 times compared to existing datasets, enable supervision under extreme viewpoint changes, and provide self-supervisory signals through Gaussian attributes. Consequently, state-of-the-art matchers trained solely on our data achieve significant zero-shot performance gains on public benchmarks, with improvements of up to 17.7%. Our work demonstrates that with proper geometric refinement, 3DGS can serve as a scalable, high-fidelity, and structurally rich data source, paving the way for a new generation of robust zero-shot image matchers.
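The epipolar error used to evaluate correspondence quality is commonly computed as the Sampson distance between matched points under a fundamental matrix. The paper does not publish its evaluation code, so the following is only a minimal illustrative sketch of that standard metric, using NumPy:

```python
import numpy as np

def sampson_error(F, x1, x2):
    """First-order (Sampson) epipolar error for point correspondences.

    F  : 3x3 fundamental matrix mapping image-1 points to epipolar
         lines in image 2.
    x1 : (N, 2) pixel coordinates in image 1.
    x2 : (N, 2) pixel coordinates in image 2.
    Returns an (N,) array of per-correspondence errors.
    """
    # Homogeneous coordinates.
    x1h = np.hstack([x1, np.ones((len(x1), 1))])
    x2h = np.hstack([x2, np.ones((len(x2), 1))])

    Fx1 = x1h @ F.T    # rows are epipolar lines F x1 in image 2
    Ftx2 = x2h @ F     # rows are epipolar lines F^T x2 in image 1

    # Algebraic residual (x2^T F x1)^2, normalized by the line gradients.
    num = np.sum(x2h * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den
```

For a rectified stereo pair (identity rotation, horizontal translation), F reduces to a skew-symmetric matrix and the error is zero exactly when matched points share the same row, which makes the metric easy to sanity-check.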