ELViS: Efficient Visual Similarity from Local Descriptors that Generalizes Across Domains

📅 2026-03-30
🤖 AI Summary
This work addresses the challenge of generalizing image retrieval models to unseen domains when large-scale instance-level annotations are unavailable. The authors propose ELViS, an image-to-image similarity model built on local descriptor correspondences that operates in similarity space rather than representation space. An optimal transport step with data-dependent gains suppresses uninformative descriptors, and a voting process aggregates strong correspondences into an image-level similarity, yielding a model that is simple, efficient, and interpretable. Evaluated as a re-ranking method on a new benchmark of eight datasets spanning landmarks, artworks, products, and multi-domain collections, ELViS outperforms competing methods by a large margin in out-of-domain scenarios and on average, at a fraction of their computational cost.
📝 Abstract
Large-scale instance-level training data is scarce, so models are typically trained on domain-specific datasets. Yet in real-world retrieval, they must handle diverse domains, making generalization to unseen data critical. We introduce ELViS, an image-to-image similarity model that generalizes effectively to unseen domains. Unlike conventional approaches, our model operates in similarity space rather than representation space, promoting cross-domain transfer. It leverages local descriptor correspondences, refines their similarities through an optimal transport step with data-dependent gains that suppress uninformative descriptors, and aggregates strong correspondences via a voting process into an image-level similarity. This design injects strong inductive biases, yielding a simple, efficient, and interpretable model. To assess generalization, we compile a benchmark of eight datasets spanning landmarks, artworks, products, and multi-domain collections, and evaluate ELViS as a re-ranking method. Our experiments show that ELViS outperforms competing methods by a large margin in out-of-domain scenarios and on average, while requiring only a fraction of their computational cost. Code available at: https://github.com/pavelsuma/ELViS/
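The three stages named in the abstract (local descriptor correspondences, an optimal transport refinement with per-descriptor gains, and voting over strong correspondences) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Sinkhorn iterations with the gains acting as marginal masses, the mutual-best-match voting rule, and the `temperature` and `threshold` values are all assumptions made for illustration. In ELViS the gains are data-dependent; here they are simply passed in as positive numbers.

```python
import math


def cosine_similarities(desc_a, desc_b):
    """Pairwise cosine similarity between two sets of local descriptors."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]
    a, b = [unit(v) for v in desc_a], [unit(v) for v in desc_b]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]


def sinkhorn_plan(sim, mass_a, mass_b, temperature=0.1, iters=100):
    """Entropic optimal transport via Sinkhorn iterations (assumed solver).

    Alternately rescales the rows and columns of exp(sim / temperature)
    toward the marginals mass_a and mass_b, which must have equal totals.
    """
    n, m = len(mass_a), len(mass_b)
    kernel = [[math.exp(s / temperature) for s in row] for row in sim]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [mass_a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [mass_b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def image_similarity(desc_a, desc_b, gains_a=None, gains_b=None, threshold=0.5):
    """Hypothetical image-level score: strong, mutually best matches vote."""
    n, m = len(desc_a), len(desc_b)
    # Assumption: gains are positive per-descriptor weights; a low gain
    # shrinks the transport mass that a descriptor can send or receive.
    gains_a = gains_a or [1.0] * n
    gains_b = gains_b or [1.0] * m
    mass_a = [g / sum(gains_a) for g in gains_a]
    mass_b = [g / sum(gains_b) for g in gains_b]

    sim = cosine_similarities(desc_a, desc_b)
    plan = sinkhorn_plan(sim, mass_a, mass_b)

    # A correspondence votes only if it is the best entry of the plan in
    # both its row and its column and its cosine similarity is strong.
    row_best = [max(range(m), key=lambda j: plan[i][j]) for i in range(n)]
    col_best = [max(range(n), key=lambda i: plan[i][j]) for j in range(m)]
    return sum(sim[i][j] for i, j in enumerate(row_best)
               if col_best[j] == i and sim[i][j] > threshold)
```

Calling `image_similarity(query_descriptors, candidate_descriptors)` on each shortlisted candidate and sorting by the returned score would mimic the re-ranking setting in which the abstract evaluates ELViS. The score depends only on the descriptor similarity matrix, never on the descriptors' raw coordinates after that point, which is one way to read "operates in similarity space."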
Problem

Research questions and friction points this paper is trying to address.

visual similarity
domain generalization
image retrieval
local descriptors
cross-domain transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

visual similarity
local descriptors
optimal transport
cross-domain generalization
inductive bias