FS-I2P: A Hierarchical Focus-Sweep Registration Network with Dynamically Allocated Depth

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses image-to-point-cloud registration, where viewpoint changes, cross-modal discrepancies, and repetitive textures induce scale ambiguity and, in turn, erroneous correspondences. Inspired by human behavior, the authors propose a “Focus-Sweep” registration paradigm and build a Hierarchical Focus-Sweep Interaction Module within a state-space model (SSM) framework to strengthen multi-level cross-modal feature association. They also introduce a Dynamic Layer Allocation Strategy that adaptively determines the iteration depth, with the aim of better exploiting geometric constraints and improving matching robustness. The design targets two weaknesses of earlier detection-free methods: attention drift across layers and intra-scale inconsistency. On the RGB-D Scenes V2 and 7-Scenes benchmarks, the approach is reported to achieve state-of-the-art performance.

📝 Abstract

Image-to-point cloud registration is often challenged by viewpoint changes, cross-modal discrepancies, and repetitive textures, which induce scale ambiguity and consequently lead to erroneous correspondences. Recent detection-free methods alleviate this issue by leveraging multi-scale features and transformer-based interactions. However, they still suffer from attention drift across layers and intra-scale inconsistencies, hindering precise registration. Inspired by human behavior, we propose a "Focus-Sweep" paradigm and develop a Hierarchical Focus-Sweep Interaction Module within an SSM-based framework to enhance multi-level cross-modal feature association. In addition, we introduce a Dynamic Layer Allocation Strategy that adaptively determines the iteration depth to better exploit geometric constraints and improve matching robustness. Extensive experiments and ablations on two benchmarks, RGB-D Scenes V2 and 7-Scenes, demonstrate that our approach achieves state-of-the-art performance.
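The abstract does not say how the Dynamic Layer Allocation Strategy chooses its depth. As a rough illustration of the general idea of adaptive iteration depth only, the sketch below repeats a refinement step until the update becomes small or a layer budget runs out. The function names, the convergence test, the tolerance, and the toy update rule are all assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def refine_with_dynamic_depth(
    state: List[float],
    step: Callable[[List[float]], List[float]],
    max_layers: int = 8,
    tol: float = 1e-3,
) -> Tuple[List[float], int]:
    """Apply `step` repeatedly and stop early once the update is small.

    Hypothetical stand-in for an adaptive-depth rule: the stopping
    criterion here (largest per-element change below `tol`) is an
    assumption, not the paper's strategy.
    """
    for depth in range(1, max_layers + 1):
        new_state = step(state)
        # Largest per-element change produced by this layer.
        delta = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if delta < tol:
            return state, depth  # converged: no further layers needed
    return state, max_layers  # budget exhausted without converging


def halve_toward_one(x: List[float]) -> List[float]:
    """Toy refinement layer: move each value halfway toward 1.0."""
    return [v + 0.5 * (1.0 - v) for v in x]


# An input that starts near the fixed point needs fewer layers
# than one that starts far away.
easy, easy_depth = refine_with_dynamic_depth([0.99], halve_toward_one)
hard, hard_depth = refine_with_dynamic_depth([0.0], halve_toward_one)
print(easy_depth, hard_depth)
```

In this toy setting the "easy" input stops well before the layer budget while the "hard" input uses all of it, which is the behavior an adaptive-depth scheme is meant to provide: computation is spent where the estimate is still changing.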

Problem

Research questions and friction points this paper is trying to address.

image-to-point cloud registration

scale ambiguity

cross-modal discrepancy

attention drift

intra-scale inconsistency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Focus-Sweep

Hierarchical Interaction

Dynamic Layer Allocation