Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Supervised semantic correspondence methods generalize poorly beyond the sparse keypoints they are trained on, effectively acting as keypoint detectors rather than modeling dense semantic correspondences. This work exposes that limitation and proposes lifting 2D keypoints into a canonical 3D space using monocular depth estimation, producing a continuous canonical manifold that supports dense correspondence learning without explicit 3D supervision or camera annotations. The authors also introduce SPair-U, an extension of SPair-71k with novel keypoint annotations, to measure generalization to unseen keypoints. Experiments show that the proposed model significantly outperforms supervised baselines on unseen keypoints, and that unsupervised baselines outperform supervised ones when evaluated across datasets, pointing to a generalization gap in the supervised paradigm.

📝 Abstract
Semantic correspondence (SC) aims to establish semantically meaningful matches across different instances of an object category. We illustrate how recent supervised SC methods remain limited in their ability to generalize beyond sparsely annotated training keypoints, effectively acting as keypoint detectors. To address this, we propose a novel approach for learning dense correspondences by lifting 2D keypoints into a canonical 3D space using monocular depth estimation. Our method constructs a continuous canonical manifold that captures object geometry without requiring explicit 3D supervision or camera annotations. Additionally, we introduce SPair-U, an extension of SPair-71k with novel keypoint annotations, to better assess generalization. Experiments demonstrate not only that our model significantly outperforms supervised baselines on unseen keypoints, highlighting its effectiveness in learning robust correspondences, but also that unsupervised baselines outperform supervised counterparts when generalized across different datasets.
Problem

Research questions and friction points this paper is trying to address.

Supervised SC methods generalize poorly beyond annotated keypoints
Proposes learning dense correspondences via 3D canonical space lifting
Introduces SPair-U dataset to better evaluate generalization
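
One way to make the "seen versus unseen keypoint" comparison concrete is sketched below. The summary does not name the evaluation metric, so this assumes PCK (percentage of correct keypoints), where a prediction counts as correct if it lies within `alpha` times the object bounding-box size of the ground truth. The function names, the `is_seen` mask, and the default `alpha=0.1` are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np


def pck(pred_xy, gt_xy, bbox_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * bbox_size of the ground truth.

    Assumption: bbox_size is a single scalar, e.g. max(height, width) of the
    target object's bounding box.
    """
    pred = np.asarray(pred_xy, dtype=np.float64)
    gt = np.asarray(gt_xy, dtype=np.float64)
    dist = np.linalg.norm(pred - gt, axis=1)  # per-keypoint pixel error
    return float(np.mean(dist <= alpha * bbox_size))


def generalization_gap(pred_xy, gt_xy, is_seen, bbox_size, alpha=0.1):
    """Return (PCK on seen keypoints, PCK on unseen keypoints, their difference).

    is_seen is a hypothetical boolean mask marking keypoints whose semantic
    label was annotated in the training set.
    """
    pred = np.asarray(pred_xy, dtype=np.float64)
    gt = np.asarray(gt_xy, dtype=np.float64)
    seen = np.asarray(is_seen, dtype=bool)
    pck_seen = pck(pred[seen], gt[seen], bbox_size, alpha)
    pck_unseen = pck(pred[~seen], gt[~seen], bbox_size, alpha)
    return pck_seen, pck_unseen, pck_seen - pck_unseen
```

A large positive gap under this kind of measurement would be consistent with a model that behaves like a detector for its training keypoints; both groups need at least one keypoint for the means to be defined.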
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lifts 2D keypoints into 3D space
Uses monocular depth estimation
Constructs continuous canonical manifold
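
The lifting step listed above can be illustrated with a minimal back-projection sketch. It assumes a pinhole camera with the principal point at the image center and a guessed focal length, since the method is described as needing no camera annotations. The depth map would come from any monocular depth estimator. The names `lift_keypoints` and `normalize_points` are hypothetical, and the centering and scaling step is only a crude stand-in for the learned canonical manifold, which the summary does not specify.

```python
import numpy as np


def lift_keypoints(keypoints_xy, depth, focal=None):
    """Back-project (N, 2) pixel keypoints to (N, 3) camera-frame points.

    Assumptions (not from the paper): pinhole model, principal point at the
    image center, focal length defaulting to max(H, W) pixels.
    """
    h, w = depth.shape
    if focal is None:
        focal = float(max(h, w))
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0

    kp = np.asarray(keypoints_xy, dtype=np.float64)
    # Sample depth at the nearest pixel, clamped to the image bounds.
    cols = np.clip(np.rint(kp[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(kp[:, 1]).astype(int), 0, h - 1)
    z = depth[rows, cols]

    x = (kp[:, 0] - cx) * z / focal
    y = (kp[:, 1] - cy) * z / focal
    return np.stack([x, y, z], axis=1)


def normalize_points(points):
    """Center the points and scale them so the farthest one has unit norm."""
    centered = points - points.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale if scale > 0 else centered
```

For example, `normalize_points(lift_keypoints(kps, depth))` yields a translation- and scale-normalized point set per image; aligning such sets across instances of a category is the part a learned canonical space would have to handle.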