GESS: Multi-cue Guided Local Feature Learning via Geometric and Semantic Synergy

2026-04-06
Citations: 0
Influential: 0
AI Summary
This work addresses a limitation of existing local feature methods: they rely solely on appearance cues, which yields unstable keypoints and poorly discriminative descriptors. The authors propose a multi-cue guided local feature learning framework that couples semantic segmentation with surface normal prediction and adds a depth-aware stability assessment. These predictions drive a Semantic-Depth Aware Keypoint (SDAK) selection mechanism and a Unified Triple-Cue Fusion (UTCF) descriptor module. Built on a lightweight backbone with a joint prediction head, the approach is evaluated on four benchmark datasets, where the authors report improved local feature detection and matching from jointly modeling semantic, geometric, and depth cues.
๐Ÿ“ Abstract
Robust local feature detection and description are foundational tasks in computer vision. Existing methods primarily rely on single appearance cues for modeling, leading to unstable keypoints and insufficient descriptor discriminability. In this paper, we propose a multi-cue guided local feature learning framework that leverages semantic and geometric cues to synergistically enhance detection robustness and descriptor discriminability. Specifically, we construct a joint semantic-normal prediction head and a depth stability prediction head atop a lightweight backbone. The former leverages a shared 3D vector field to deeply couple semantic and normal cues, thereby resolving optimization interference from heterogeneous inconsistencies. The latter quantifies the reliability of local regions from a geometric consistency perspective, providing deterministic guidance for robust keypoint selection. Based on these predictions, we introduce the Semantic-Depth Aware Keypoint (SDAK) mechanism for feature detection. By coupling semantic reliability with depth stability, SDAK reweights keypoint responses to suppress spurious features in unreliable regions. For descriptor construction, we design a Unified Triple-Cue Fusion (UTCF) module, which employs a semantic-scheduled gating mechanism to adaptively inject multi-attribute features, improving descriptor discriminability. Extensive experiments on four benchmarks validate the effectiveness of the proposed framework. The source code and pre-trained model will be available at: https://github.com/yiyscut/GESS.git.
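The reweighting idea behind SDAK can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the multiplicative coupling, the [0, 1] range of the two reliability maps, the top-k selection, and the hypothetical function names `reweight_scores` and `top_k_keypoints`. It is not the authors' implementation.

```python
from typing import List, Tuple

Grid = List[List[float]]


def reweight_scores(scores: Grid, semantic: Grid, depth: Grid) -> Grid:
    """Scale each keypoint response by two per-pixel weights in [0, 1].

    Assumption: the two cues are coupled by simple multiplication.
    """
    return [
        [s * a * d for s, a, d in zip(s_row, a_row, d_row)]
        for s_row, a_row, d_row in zip(scores, semantic, depth)
    ]


def top_k_keypoints(scores: Grid, k: int) -> List[Tuple[int, int]]:
    """Return the (row, col) positions of the k highest scores."""
    flat = [
        (value, r, c)
        for r, row in enumerate(scores)
        for c, value in enumerate(row)
    ]
    flat.sort(key=lambda item: item[0], reverse=True)
    return [(r, c) for _, r, c in flat[:k]]


# Toy 2x2 maps (made-up values, not data from the paper).
scores = [[0.9, 0.2], [0.8, 0.7]]    # raw detector responses
semantic = [[0.1, 1.0], [1.0, 0.9]]  # low where the semantic class is unreliable
depth = [[1.0, 1.0], [0.5, 1.0]]     # low where depth is unstable

weighted = reweight_scores(scores, semantic, depth)
keypoints = top_k_keypoints(weighted, k=2)
```

In this toy input the strongest raw response, at position (0, 0), sits in a semantically unreliable region, so it drops out of the top two after reweighting. That is the kind of suppression of spurious features the abstract describes.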
Problem

Research questions and friction points this paper is trying to address.

local feature detection
descriptor discriminability
multi-cue learning
semantic cues
geometric consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-cue fusion
semantic-geometric synergy
local feature learning
keypoint reliability
descriptor discriminability
Yang Yi
College of Intelligence Science and Technology, National University of Defense Technology, China
Xieyuanli Chen
Associate Professor, NUDT, China
Robotics, SLAM, Localization, LiDAR Perception, Robot Learning
Jinpu Zhang
College of Intelligence Science and Technology, National University of Defense Technology, China
Hui Shen
National University of Defense Technology
fMRI, Brain Network
Dewen Hu
College of Intelligence Science and Technology, National University of Defense Technology, China