Relational Representation Learning Network for Cross-Spectral Image Patch Matching

πŸ“… 2024-03-18
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing cross-spectral patch matching methods overemphasize inter-patch relationships while neglecting intrinsic feature modeling of individual patches. To address this, we propose a unified relational representation learning frameworkβ€”the first to jointly model both intrinsic patch features and inter-patch relationships. Specifically, we design a joint learning mechanism integrating intrinsic feature auto-encoding representation with deep feature interaction; introduce a lightweight Multi-dimensional Global-to-Local Attention (MGLA) module to enhance multi-scale contextual awareness; and develop an Attention-driven Lightweight Feature Extraction (ALFE) network coupled with a Multi-Loss Post-Pruning (MLPP) optimization strategy. Evaluated on multiple public benchmarks, our framework achieves state-of-the-art performance, significantly improving matching accuracy without increasing model parameter count or inference latency.

πŸ“ Abstract
Recently, feature relation learning has drawn widespread attention in cross-spectral image patch matching. However, existing research focuses on extracting diverse relations between image patch features and ignores sufficient intrinsic feature representations of individual image patches. Therefore, we propose an innovative relational representation learning idea that simultaneously focuses on sufficiently mining the intrinsic features of individual image patches and the relations between image patch features. Based on this, we construct a Relational Representation Learning Network (RRL-Net). Specifically, we construct an autoencoder to fully characterize the individual intrinsic features, and introduce a feature interaction learning (FIL) module to extract deep-level feature relations. To further mine individual intrinsic features, a lightweight multi-dimensional global-to-local attention (MGLA) module is constructed to enhance the global feature extraction of individual image patches and capture local dependencies within global features. Building on the MGLA module, we further explore the feature extraction network and construct an attention-based lightweight feature extraction (ALFE) network. In addition, we propose a multi-loss post-pruning (MLPP) optimization strategy, which greatly promotes network optimization while avoiding increases in parameters and inference time. Extensive experiments demonstrate that our RRL-Net achieves state-of-the-art (SOTA) performance on multiple public datasets. Our code is available at https://github.com/YuChuang1205/RRL-Net.
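The abstract only names the framework's components; as a toy illustration of how an autoencoder reconstruction term (intrinsic features) and a feature-interaction matching term (inter-patch relations) could be combined into a single objective, here is a minimal NumPy sketch. All function names, shapes, and the logistic matching head are hypothetical stand-ins, not the paper's actual RRL-Net architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy encoder: a single linear projection standing in for the feature extractor."""
    return np.tanh(x @ W)

def decode(z, W):
    """Toy decoder with tied weights: maps the latent code back to input space."""
    return z @ W.T

def reconstruction_loss(x, W):
    """Autoencoder term: encourages the encoding to retain intrinsic patch content."""
    x_hat = decode(encode(x, W), W)
    return np.mean((x - x_hat) ** 2)

def interaction_features(za, zb):
    """Simple cross-patch interactions (absolute difference and product),
    standing in for the paper's feature interaction learning (FIL) module."""
    return np.concatenate([np.abs(za - zb), za * zb], axis=-1)

def matching_loss(za, zb, label, V):
    """Binary match/non-match head on the interaction features (logistic loss)."""
    logit = interaction_features(za, zb) @ V
    p = 1.0 / (1.0 + np.exp(-logit))
    return -np.mean(label * np.log(p + 1e-9) + (1 - label) * np.log(1 - p + 1e-9))

# Two 64-dim "patch" vectors and toy parameters (all sizes are arbitrary).
xa, xb = rng.normal(size=(2, 64))
W = rng.normal(scale=0.1, size=(64, 16))   # shared encoder/decoder weights
V = rng.normal(scale=0.1, size=16 * 2)     # matching-head weights

za, zb = encode(xa, W), encode(xb, W)
total = reconstruction_loss(np.stack([xa, xb]), W) + matching_loss(za, zb, 1.0, V)
```

The key point the abstract makes is that both terms are optimized jointly, so the shared encoder must serve reconstruction and matching at once rather than relation learning alone.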
Problem

Research questions and friction points this paper is trying to address.

Enhancing cross-spectral image patch matching accuracy
Jointly mining intrinsic patch features and inter-patch relations
Optimizing the network without increasing parameters or inference time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoencoder for intrinsic feature extraction
Feature interaction learning module
Lightweight multi-dimensional global-to-local attention module
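The "global-to-local" attention idea above can be sketched as a two-stage gating pass: first a global channel statistic gates whole channels, then a spatial statistic over the gated map re-weights local positions. This is only a minimal NumPy sketch of the general pattern, assuming a (C, H, W) feature map; the actual MGLA module's design is not specified in this summary:

```python
import numpy as np

def global_to_local_attention(feat):
    """Two-stage gating sketch on a (C, H, W) feature map.

    Step 1 (global): squeeze each channel to one statistic via global average
    pooling and turn it into a sigmoid channel gate.
    Step 2 (local): compute a per-position statistic over the gated channels
    and use it as a spatial gate, capturing local dependencies within the
    globally gated features.
    """
    # Global branch: channel gate from global average pooling.
    channel_stat = feat.mean(axis=(1, 2))                  # shape (C,)
    channel_gate = 1.0 / (1.0 + np.exp(-channel_stat))     # sigmoid, (C,)
    gated = feat * channel_gate[:, None, None]

    # Local branch: spatial gate from the per-position channel mean.
    spatial_stat = gated.mean(axis=0)                      # shape (H, W)
    spatial_gate = 1.0 / (1.0 + np.exp(-spatial_stat))     # sigmoid, (H, W)
    return gated * spatial_gate[None, :, :]

out = global_to_local_attention(np.random.default_rng(1).normal(size=(8, 4, 4)))
```

Because both gates are computed from pooled statistics rather than learned dense maps, a module of this shape stays lightweight, which matches the paper's stated goal of avoiding parameter growth.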
Chuan Yu
Shenyang Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Yunpeng Liu
Wuhan University of Technology
Jinmiao Zhao
Shenyang Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Dou Quan
Xidian University
Zelin Shi
Shenyang Institute of Automation, Chinese Academy of Sciences