S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening

📅 2025-12-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the inherent coupling between spatial detail preservation and spectral fidelity in pansharpening, the fusion of a high-resolution panchromatic (PAN) image with a low-resolution multispectral (LRMS) image. To decouple these competing objectives, the authors propose S2WMamba, a spectral–spatial dual-branch fusion framework. A 2D Haar wavelet transform extracts spatial detail from PAN, and a channel-wise 1D Haar decomposition models the frequency characteristics of each pixel's spectrum; the Mamba architecture captures long-range dependencies and enables lightweight cross-modal interaction. A multi-scale dynamic gating mechanism adaptively fuses the frequency-domain features. On the WV3, GF2, and QB datasets, the method achieves PSNR gains of up to 0.23 dB over state-of-the-art methods, and on full-resolution WV3 it attains an HQNR of 0.956, surpassing FusionMamba and others. The results support combining frequency-domain decoupling with state-space modeling.

📝 Abstract
Pansharpening fuses a high-resolution panchromatic (PAN) image with a low-resolution multispectral (LRMS) image to produce a high-resolution multispectral (HRMS) image. A key difficulty is that jointly processing PAN and MS often entangles spatial detail with spectral fidelity. We propose S2WMamba, which explicitly disentangles frequency information and then performs lightweight cross-modal interaction. Concretely, a 2D Haar DWT is applied to PAN to localize spatial edges and textures, while a channel-wise 1D Haar DWT treats each pixel's spectrum as a 1D signal to separate low- and high-frequency components and limit spectral distortion. The resulting Spectral branch injects wavelet-extracted spatial details into MS features, and the Spatial branch refines PAN features using spectra from the 1D pyramid; the two branches exchange information through Mamba-based cross-modulation that models long-range dependencies with linear complexity. A multi-scale dynamic gate (multiplicative + additive) then adaptively fuses the branch outputs. On WV3, GF2, and QB, S2WMamba matches or surpasses recent strong baselines (FusionMamba, CANNet, U2Net, ARConv), improving PSNR by up to 0.23 dB and reaching an HQNR of 0.956 on full-resolution WV3. Ablations justify the choice of 2D/1D DWT placement, parallel dual branches, and the fusion gate. Our code is available at https://github.com/KagUYa66/S2WMamba.
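The abstract describes the fusion gate only as "multiplicative + additive". One possible reading, at a single scale, is sketched below in NumPy. The function name `gated_fuse`, the weights `w_mul` and `w_add` (stand-ins for 1x1 convolutions), and the choice to derive both gates from the spatial branch are assumptions made for illustration; the paper's actual gate may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(spectral, spatial, w_mul, w_add):
    """Fuse two (C, H, W) feature maps with a multiplicative and an additive term.

    w_mul, w_add: (C, C) hypothetical 1x1-convolution weights (assumed, not from the paper).
    """
    # Multiplicative gate in (0, 1), predicted per channel and pixel from the spatial branch.
    mul_gate = sigmoid(np.einsum("oc,chw->ohw", w_mul, spatial))
    # Additive term: a linear projection of the spatial branch.
    add_term = np.einsum("oc,chw->ohw", w_add, spatial)
    return spectral * mul_gate + add_term

# Usage: random features with 8 channels on a 16x16 grid.
rng = np.random.default_rng(0)
spec = rng.standard_normal((8, 16, 16))
spat = rng.standard_normal((8, 16, 16))
fused = gated_fuse(spec, spat, rng.standard_normal((8, 8)), rng.standard_normal((8, 8)))
```

A multi-scale variant would presumably apply such a gate at each level of the wavelet pyramid, but that detail is not stated in the abstract.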
Problem

Research questions and friction points this paper is trying to address.

Fuses high-resolution PAN with low-resolution MS images
Disentangles spatial detail from spectral fidelity in pansharpening
Models long-range dependencies with linear complexity for fusion
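To make "long-range dependencies with linear complexity" concrete, here is a minimal sketch of a diagonal linear state-space recurrence. It is not Mamba itself: Mamba makes the parameters input-dependent and uses a hardware-aware parallel scan, whereas this toy keeps `a`, `b`, and `c` fixed. The function name `ssm_scan` and all parameter names are illustrative assumptions.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Run h_t = a * h_{t-1} + b * x_t and y_t = c . h_t over a 1D sequence.

    x: (T,) input sequence; a, b, c: (N,) diagonal state parameters (fixed here).
    Cost is O(T * N): one state update per token, no pairwise attention matrix.
    """
    h = np.zeros_like(a, dtype=float)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a * h + b * x_t      # the state carries information from all earlier tokens
        y[t] = float(c @ h)      # read out the current state
    return y

# Usage: an impulse at t=0 decays geometrically, so early inputs still influence late outputs.
impulse_response = ssm_scan(np.array([1.0, 0.0, 0.0, 0.0]),
                            a=np.array([0.5]), b=np.array([1.0]), c=np.array([1.0]))
```

Each output depends on every earlier input through the state `h`, yet the loop visits each token once, which is where the linear cost in sequence length comes from.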
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 2D Haar DWT on PAN to extract spatial edges and textures
Applies channel-wise 1D Haar DWT to separate spectral frequency components
Employs Mamba-based cross-modulation for lightweight long-range interaction
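The two wavelet steps listed above can be sketched in plain NumPy as single-level orthonormal Haar transforms. This is an illustration under stated assumptions, not the authors' implementation: the function names `haar2d` and `haar1d_spectral` are hypothetical, the inputs are assumed to have even spatial sizes and an even band count, and sub-band naming conventions (which detail band is called "horizontal") vary between libraries.

```python
import numpy as np

def haar2d(pan):
    """Single-level 2D Haar DWT of a (H, W) PAN image with even H and W.

    Returns the approximation band and three detail bands, each (H/2, W/2).
    """
    a, b = pan[0::2, 0::2], pan[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = pan[1::2, 0::2], pan[1::2, 1::2]   # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0            # local average: low-frequency content
    det_cols = (a - b + c - d) / 2.0          # differences across columns
    det_rows = (a + b - c - d) / 2.0          # differences across rows
    det_diag = (a - b - c + d) / 2.0          # diagonal differences
    return approx, det_cols, det_rows, det_diag

def haar1d_spectral(ms):
    """Single-level 1D Haar DWT along the band axis of a (C, H, W) MS cube, C even.

    Each pixel's spectrum is treated as a 1D signal of length C.
    """
    even, odd = ms[0::2], ms[1::2]
    low = (even + odd) / np.sqrt(2.0)         # smooth spectral trend
    high = (even - odd) / np.sqrt(2.0)        # band-to-band variation
    return low, high

# Usage: a 4x4 PAN ramp and a spectrally flat 4-band MS cube.
pan = np.arange(16.0).reshape(4, 4)
approx, det_cols, det_rows, det_diag = haar2d(pan)
low, high = haar1d_spectral(np.ones((4, 2, 2)))
```

Because both transforms are orthonormal, the sub-bands together preserve the energy of the input, and a spectrally flat pixel produces a zero high-frequency band.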
Haoyu Zhang
University of Electronic Science and Technology of China
Junhan Luo
University of Electronic Science and Technology of China
Yugang Cao
University of Electronic Science and Technology of China
Siran Peng
CASIA
Computer Vision, Image Fusion, Deepfake Detection
Jie Huang
University of Electronic Science and Technology of China
Liangjian-Deng
University of Electronic Science and Technology of China