Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work reframes visual place recognition (VPR) as an image-pair retrieval task: given two disjoint image sets, find the top-matching image pairs to feed downstream applications such as scene registration, SLAM, and structure-from-motion. The study compares prominent VPR methods (NetVLAD-style baselines, CosPlace, EigenPlaces, MixVPR, AnyLoc, SALAD, and MegaLoc) on three datasets from different domains: object-centric outdoor scenes (Tanks and Temples), indoor RGB-D scans (ScanNet-GS), and autonomous-driving sequences (KITTI). Treating VPR explicitly as a retrieval front-end for registration pipelines exposes how strongly existing approaches depend on the domain, particularly under perceptual aliasing and incomplete sequences. The results indicate that modern global descriptors are increasingly suitable as off-the-shelf image-pair retrieval modules, and the observed per-domain strengths and weaknesses offer practical guidance for choosing VPR components in robust 3D mapping and registration pipelines.

📝 Abstract

Visual Place Recognition (VPR) is a core component in computer vision, typically formulated as an image retrieval task for localization, mapping, and navigation. In this work, we instead study VPR as an image pair retrieval front-end for registration pipelines, where the goal is to find top-matching image pairs between two disjoint image sets for downstream tasks such as scene registration, SLAM, and Structure-from-Motion. We comparatively evaluate state-of-the-art VPR families - NetVLAD-style baselines, classification-based global descriptors (CosPlace, EigenPlaces), feature-mixing (MixVPR), and foundation-model-driven methods (AnyLoc, SALAD, MegaLoc) - on three challenging datasets: object-centric outdoor scenes (Tanks and Temples), indoor RGB-D scans (ScanNet-GS), and autonomous-driving sequences (KITTI). We show that modern global descriptor approaches are increasingly suitable as off-the-shelf image pair retrieval modules in challenging scenarios including perceptual aliasing and incomplete sequences, while exhibiting clear, domain-dependent strengths and weaknesses that are critical when choosing VPR components for robust mapping and registration.
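The image pair retrieval setting described in the abstract can be illustrated with a minimal sketch. It assumes each image in the two disjoint sets has already been reduced to a global descriptor vector by some VPR model, and that pairs are ranked by cosine similarity; the function name `top_k_pairs`, the array names, and the choice of cosine similarity are illustrative assumptions, not the paper's confirmed protocol.

```python
import numpy as np


def top_k_pairs(desc_a, desc_b, k=5):
    """Return the k highest-scoring (i, j, score) pairs between two descriptor sets.

    desc_a: (Na, D) array of global descriptors for image set A (hypothetical input).
    desc_b: (Nb, D) array of global descriptors for image set B (hypothetical input).
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    sim = a @ b.T  # shape (Na, Nb): similarity of every cross-set pair

    k = min(k, sim.size)
    if k <= 0:
        return []

    # Select the k largest entries of the flattened matrix, then sort them
    # in descending order of similarity.
    scores = sim.ravel()
    flat = np.argpartition(scores, -k)[-k:]
    flat = flat[np.argsort(scores[flat])[::-1]]
    rows, cols = np.unravel_index(flat, sim.shape)
    return [(int(i), int(j), float(sim[i, j])) for i, j in zip(rows, cols)]


if __name__ == "__main__":
    # Random stand-ins for descriptors; a real pipeline would use model outputs.
    rng = np.random.default_rng(0)
    set_a = rng.normal(size=(10, 256))
    set_b = rng.normal(size=(12, 256))
    for i, j, score in top_k_pairs(set_a, set_b, k=3):
        print(f"A[{i}] <-> B[{j}]  cosine similarity = {score:.3f}")
```

In a registration pipeline, the returned index pairs would be the candidates passed on to local feature matching and pose estimation. For large image sets, the dense similarity matrix would typically be replaced by an approximate nearest-neighbor index.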

Problem

Research questions and friction points this paper is trying to address.

Visual Place Recognition

Image Pair Retrieval

3D Vision

Robotics

Scene Registration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual Place Recognition

Image Pair Retrieval

Global Descriptors