A Diff-Attention Aware State Space Fusion Model for Remote Sensing Classification

📅 2025-04-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the substantial semantic gap between multispectral (MS) and panchromatic (PAN) remote sensing images, the confusion between shared and modality-specific features, and severe feature redundancy in the fusion stage, this paper proposes a diff-attention aware state space fusion model (DAS2F-Model) for multimodal remote sensing image classification. The method comprises three key components: (1) a cross-modal diff-attention module (CMDA-Module) that explicitly models semantic discrepancies between the MS and PAN modalities and disentangles shared versus modality-specific features; (2) an attention-aware linear fusion module (AALF-Module) enabling pixel-wise adaptively weighted fusion; and (3) a space preserving visual mamba (SPVM) that enhances long-range spatial modeling while retaining local detail. Evaluated on standard remote sensing benchmarks, the approach consistently outperforms state-of-the-art fusion methods, achieving significant improvements in classification accuracy. The source code is publicly available.

📝 Abstract
Multispectral (MS) and panchromatic (PAN) images describe the same land surface, so they not only have their own respective advantages but also share a large amount of similar information. To separate this shared information from each modality's distinctive advantages, and to reduce feature redundancy in the fusion stage, this paper introduces a diff-attention aware state space fusion model (DAS2F-Model) for multimodal remote sensing image classification. Based on the selective state space model, a cross-modal diff-attention module (CMDA-Module) is designed to extract and separate the common features and the respective dominant features of the MS and PAN images. Within it, a space preserving visual mamba (SPVM) retains spatial image features and captures local features by reasonably reorganizing the input to visual mamba. Because the separated features carry large semantic differences at the fusion stage, and simple fusion operations struggle to integrate such significantly different features effectively, an attention-aware linear fusion module (AALF-Module) is proposed. It performs pixel-wise linear fusion by computing influence coefficients, which allows it to fuse features with large semantic differences while keeping the feature size unchanged. Empirical evaluations indicate that the presented method achieves better results than alternative approaches. The relevant code can be found at: https://github.com/AVKSKVL/DAS-F-Model
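The pixel-wise linear fusion described for the AALF-Module can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not the authors' implementation: the influence-coefficient head here is a hypothetical single linear map with a sigmoid, standing in for whatever the paper actually learns.

```python
import numpy as np

def pixelwise_linear_fusion(feat_a, feat_b, w, bias=0.0):
    """Toy AALF-style fusion: a per-pixel influence coefficient alpha
    is computed from both feature maps, and the output is the blend
    alpha * feat_a + (1 - alpha) * feat_b, same shape as the inputs.

    feat_a, feat_b: (C, H, W) feature maps (e.g. MS- and PAN-branch features)
    w:              (2C,) weights of a hypothetical 1x1 coefficient head
    """
    stacked = np.concatenate([feat_a, feat_b], axis=0)         # (2C, H, W)
    logits = np.tensordot(w, stacked, axes=([0], [0])) + bias  # (H, W)
    alpha = 1.0 / (1.0 + np.exp(-logits))                      # in (0, 1)
    return alpha * feat_a + (1.0 - alpha) * feat_b

rng = np.random.default_rng(0)
ms = rng.standard_normal((8, 16, 16))   # mock MS-branch features
pan = rng.standard_normal((8, 16, 16))  # mock PAN-branch features
w = rng.standard_normal(16) * 0.1
fused = pixelwise_linear_fusion(ms, pan, w)
print(fused.shape)  # (8, 16, 16)
```

Because alpha stays in (0, 1), each output pixel is a convex combination of the two inputs, so the fused feature size is unchanged, matching the property the abstract emphasizes.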
Problem

Research questions and friction points this paper is trying to address.

How to separate shared and modality-specific features in MS and PAN images
How to reduce feature redundancy in multimodal image fusion
How to improve remote sensing image classification via attention-aware fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal diff-attention module for feature separation
Space preserving visual mamba retains spatial features
Attention-aware linear fusion integrates features with large semantic differences
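One plausible reading of the diff-attention separation listed above, as a toy sketch under my own assumptions rather than the CMDA-Module itself: use the element-wise difference between MS and PAN features to gate each map into a common part (where the modalities agree) and modality-specific parts (where they disagree).

```python
import numpy as np

def diff_gated_split(ms, pan):
    """Toy diff-driven separation: a gate derived from |ms - pan|
    is large where the modalities disagree (modality-specific content)
    and small where they agree (shared content)."""
    diff = np.abs(ms - pan)
    gate = 1.0 - np.exp(-diff)                # in [0, 1); grows with disagreement
    common = (1.0 - gate) * 0.5 * (ms + pan)  # kept where modalities agree
    ms_specific = gate * ms                   # kept where MS differs from PAN
    pan_specific = gate * pan
    return common, ms_specific, pan_specific

ms = np.array([[1.0, 2.0], [3.0, 4.0]])
pan = np.array([[1.0, 0.0], [3.0, 8.0]])
common, ms_s, pan_s = diff_gated_split(ms, pan)
# Where ms == pan the gate is 0, so the specific parts vanish there
# and the common part reduces to the shared value itself.
```

The paper's actual module is attention-based and learned; this fixed gate only illustrates the separation idea that shared and dominant features can be routed to different outputs without changing their size.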
Wenping Ma
Xidian University
Artificial intelligence
Boyou Xue
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, International Research Center for Intelligent Perception and Computation, School of Artificial Intelligence, Xidian University, Xi’an 710071, China
Mengru Ma
Xidian University
Fusion Classification, Remote Sensing Intelligent Interpretation
Chuang Chen
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, International Research Center for Intelligent Perception and Computation, School of Artificial Intelligence, Xidian University, Xi’an 710071, China
Hekai Zhang
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, International Research Center for Intelligent Perception and Computation, School of Artificial Intelligence, Xidian University, Xi’an 710071, China
Hao Zhu
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, International Research Center for Intelligent Perception and Computation, School of Artificial Intelligence, Xidian University, Xi’an 710071, China