🤖 AI Summary
To address the low retrieval efficiency in large-scale 3D point cloud scenes, this paper proposes the first differentiable search index (DSI) method tailored for point clouds. The approach maps point cloud descriptors end-to-end to compact 1D hash-like identifiers, enabling constant-time (O(1)) direct retrieval. It adapts the DSI framework—originally developed for text—to 3D point clouds by designing a Vision Transformer (ViT)-based encoder with joint positional and semantic encoding, which supports differentiable learning of identifiers and end-to-end optimization. Evaluated on a public benchmark, the method achieves competitive recall and localization accuracy while substantially accelerating retrieval, sidestepping the computational bottleneck of comparing a query against every reference descriptor as in conventional approximate nearest neighbor (ANN) search, and pointing toward real-time large-scale 3D scene recognition.
📝 Abstract
Retrieval in 3D point clouds is a challenging task that consists of retrieving, for a given query, the most similar point clouds from a reference set. Current methods identify similar point clouds by comparing their descriptors. Because this comparison step is computationally expensive, we focus on accelerating retrieval by adapting the Differentiable Search Index (DSI), a transformer-based approach originally designed for text information retrieval, to 3D point cloud retrieval. Our approach generates 1D identifiers from the point descriptors, enabling direct retrieval in constant time. To adapt DSI to 3D data, we integrate Vision Transformers that map descriptors to these identifiers while incorporating positional and semantic encoding. The approach is evaluated for place recognition on a public benchmark, comparing both the quality and the speed of its retrieval against state-of-the-art methods.
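To make the retrieval pattern concrete, the sketch below illustrates the DSI-style pipeline described above: a mapping from descriptors to compact 1D identifiers, followed by a constant-time lookup. This is a minimal illustration, not the paper's method: the actual identifier mapping is a trained ViT with positional and semantic encoding, whereas here a fixed random projection with argmax bucketing stands in as a hypothetical placeholder, purely to show the index/query structure.

```python
import numpy as np

rng = np.random.default_rng(0)
DESC_DIM, NUM_IDS = 32, 16  # illustrative sizes, not from the paper

# Hypothetical stand-in for the learned descriptor -> identifier mapping
# (the paper trains a ViT end-to-end for this step).
W = rng.standard_normal((DESC_DIM, NUM_IDS))

def predict_identifier(descriptor: np.ndarray) -> int:
    """Map a point-cloud descriptor to a compact 1D identifier."""
    return int(np.argmax(descriptor @ W))

# Indexing phase: assign each reference point cloud an identifier once.
reference_descs = rng.standard_normal((100, DESC_DIM))
index: dict[int, list[int]] = {}
for i, desc in enumerate(reference_descs):
    index.setdefault(predict_identifier(desc), []).append(i)

# Query phase: one forward pass plus one dictionary lookup gives
# constant-time candidate retrieval, instead of comparing the query
# against every reference descriptor as in ANN search.
query = reference_descs[42]
candidates = index.get(predict_identifier(query), [])
assert 42 in candidates
```

The key property shown is that, once identifiers are learned, query cost no longer grows with the size of the reference set; only the identifier prediction and a hash-table lookup remain.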