🤖 AI Summary
This paper addresses instance-level image retrieval: finding images that contain the same object as a query, despite significant variations in scale, pose, and appearance. We propose Patchify, a fine-tuning-free framework that partitions database images into structured local patches and performs cross-granularity matching between global query features and patch-level features, enabling spatially interpretable retrieval. We introduce LocScore, a novel localization-aware evaluation metric, and reveal the critical role of information preservation during feature compression; to this end, we integrate Product Quantization for efficient compression. Experiments demonstrate that Patchify consistently outperforms global-feature baselines across multiple benchmarks and backbone architectures, and significantly improves re-ranking accuracy. The method supports real-time retrieval over databases of up to ten million images, achieving a favorable trade-off among accuracy, spatial interpretability, and scalability.
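The cross-granularity matching idea above can be sketched in a few lines: pool a database image's feature map into a small grid of patch descriptors, then score the image by the best cosine similarity between the global query descriptor and any patch. This is a minimal illustration, not the paper's implementation; the grid pooling, the helper names, and the max-similarity scoring rule are assumptions for exposition.

```python
import numpy as np

def patchify_descriptors(feat_map: np.ndarray, grid: int = 3) -> np.ndarray:
    """Pool an (H, W, D) feature map into grid*grid L2-normalized patch
    descriptors via average pooling over a regular grid.
    (Hypothetical helper: the paper's exact patch layout is not shown here.)"""
    H, W, D = feat_map.shape
    patches = []
    for i in range(grid):
        for j in range(grid):
            cell = feat_map[i * H // grid:(i + 1) * H // grid,
                            j * W // grid:(j + 1) * W // grid]
            v = cell.mean(axis=(0, 1))
            patches.append(v / (np.linalg.norm(v) + 1e-12))
    return np.stack(patches)  # shape: (grid*grid, D)

def patch_score(query_global: np.ndarray, patch_descs: np.ndarray) -> float:
    """Score one database image: best cosine similarity between the global
    query descriptor and any of the image's patch descriptors."""
    q = query_global / (np.linalg.norm(query_global) + 1e-12)
    return float((patch_descs @ q).max())
```

Because each patch descriptor carries its grid position, the arg-max patch also gives a coarse localization of the matched object, which is what a localization-aware metric like LocScore can evaluate.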
📝 Abstract
Instance-level image retrieval aims to find images containing the same object as a given query, despite variations in size, position, or appearance. To address this challenging task, we propose Patchify, a simple yet effective patch-wise retrieval framework that offers high performance, scalability, and interpretability without requiring fine-tuning. Patchify divides each database image into a small number of structured patches and performs retrieval by comparing these local features with a global query descriptor, enabling accurate and spatially grounded matching. To assess not just retrieval accuracy but also spatial correctness, we introduce LocScore, a localization-aware metric that quantifies whether the retrieved region aligns with the target object. This makes LocScore a valuable diagnostic tool for understanding and improving retrieval behavior. We conduct extensive experiments across multiple benchmarks, backbones, and region selection strategies, showing that Patchify outperforms global methods and complements state-of-the-art reranking pipelines. Furthermore, we apply Product Quantization for efficient large-scale retrieval and highlight the importance of using informative features during compression, which significantly boosts performance. Project website: https://wons20k.github.io/PatchwiseRetrieval/
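For scale, the abstract applies Product Quantization (PQ) to compress patch descriptors. A minimal from-scratch sketch of PQ is shown below: split each D-dimensional vector into m subvectors, learn a small k-means codebook per subspace, and store only the m codeword indices per vector. The function names and parameters here are illustrative assumptions; production systems typically use a library such as Faiss rather than this toy trainer.

```python
import numpy as np

def pq_train(X: np.ndarray, m: int = 4, k: int = 16, iters: int = 10, seed: int = 0):
    """Learn one k-word codebook per subspace with a few k-means iterations.
    X: (n, D) training vectors, D divisible by m. (Illustrative sketch.)"""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    d = D // m
    codebooks = []
    for s in range(m):
        sub = X[:, s * d:(s + 1) * d]
        C = sub[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(iters):
            assign = np.argmin(((sub[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for c in range(k):
                pts = sub[assign == c]
                if len(pts):  # keep old centroid if a cluster goes empty
                    C[c] = pts.mean(axis=0)
        codebooks.append(C)
    return codebooks

def pq_encode(X: np.ndarray, codebooks) -> np.ndarray:
    """Replace each subvector by its nearest codeword index -> (n, m) codes."""
    d = codebooks[0].shape[1]
    codes = [np.argmin(((X[:, s * d:(s + 1) * d][:, None] - C[None]) ** 2).sum(-1), axis=1)
             for s, C in enumerate(codebooks)]
    return np.stack(codes, axis=1)

def pq_decode(codes: np.ndarray, codebooks) -> np.ndarray:
    """Approximate reconstruction by concatenating the looked-up codewords."""
    return np.concatenate([C[codes[:, s]] for s, C in enumerate(codebooks)], axis=1)
```

With m subspaces of k codewords each, a vector costs only m small integers instead of D floats, which is what makes ten-million-image databases tractable; the abstract's point is that the quality of the features fed into this compression step strongly affects final retrieval accuracy.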