SALAD-Pan: Sensor-Agnostic Latent Adaptive Diffusion for Pan-Sharpening

📅 2026-02-04
🤖 AI Summary
This work addresses two limitations of existing diffusion models for pansharpening: they operate in pixel space, which makes inference slow, and they require a separate model for each sensor. The authors propose SALAD-Pan, a sensor-agnostic latent-space diffusion framework. A band-wise single-channel variational autoencoder compresses multispectral images with any number of bands into compact latent representations. Spectral physical priors and the PAN and MS images are then injected into the diffusion backbone through unidirectional and bidirectional interactive control structures, respectively. A lightweight cross-spectral attention module strengthens spectral consistency. Experiments on GaoFen-2, QuickBird, and WorldView-3 show that the method outperforms state-of-the-art diffusion-based approaches, achieves 2–3× faster inference, and supports zero-shot cross-sensor pansharpening.

📝 Abstract
Recently, diffusion models have brought novel insights to pansharpening and notably improved fusion precision. However, most existing models perform diffusion in pixel space and train distinct models for different multispectral (MS) imagery, suffering from high latency and sensor-specific limitations. In this paper, we present SALAD-Pan, a sensor-agnostic latent-space diffusion method for efficient pansharpening. Specifically, SALAD-Pan trains a band-wise single-channel VAE to encode high-resolution multispectral (HRMS) images into compact latent representations, supporting MS images with various channel counts and establishing a basis for acceleration. Then, spectral physical properties, along with PAN and MS images, are injected into the diffusion backbone through unidirectional and bidirectional interactive control structures, respectively, achieving high-precision fusion in the diffusion process. Finally, a lightweight cross-spectral attention module is added to the central layer of the diffusion model, reinforcing spectral connections to boost spectral consistency and further improve fusion precision. Experimental results on GaoFen-2, QuickBird, and WorldView-3 demonstrate that SALAD-Pan outperforms state-of-the-art diffusion-based methods across all three datasets, attains a 2–3× inference speedup, and exhibits robust zero-shot (cross-sensor) capability.
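The band-wise encoding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_encode_band` is a hypothetical stand-in that uses plain average pooling in place of a trained single-channel VAE encoder, and `make_image`, `encode_bandwise`, and `latent_shape` are illustrative names. The point it shows is only that an encoder defined on one band at a time is indifferent to how many bands the sensor provides.

```python
def toy_encode_band(band, factor=4):
    """Hypothetical stand-in for a single-channel encoder.

    Downsamples one band (a 2D list of floats) by factor x factor
    average pooling. Assumes height and width are divisible by factor.
    """
    height, width = len(band), len(band[0])
    area = factor * factor
    return [
        [
            sum(band[i + di][j + dj]
                for di in range(factor)
                for dj in range(factor)) / area
            for j in range(0, width, factor)
        ]
        for i in range(0, height, factor)
    ]


def encode_bandwise(image, factor=4):
    """Apply the same single-channel encoder to every band independently."""
    return [toy_encode_band(band, factor) for band in image]


def make_image(n_bands, size):
    """Build a synthetic image where band b is filled with the value b."""
    return [[[float(b)] * size for _ in range(size)] for b in range(n_bands)]


def latent_shape(latents):
    """Return (bands, height, width) of a band-wise latent stack."""
    return (len(latents), len(latents[0]), len(latents[0][0]))


# The same encoder handles a 4-band and an 8-band input without changes.
z4 = encode_bandwise(make_image(4, 8))
z8 = encode_bandwise(make_image(8, 8))
print(latent_shape(z4), latent_shape(z8))
```

Because each band is encoded separately, the band count only changes the number of latents produced, not the encoder itself. Any interaction between bands would have to come from a later stage, which is the role the abstract assigns to the cross-spectral attention module.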
Problem

Research questions and friction points this paper is trying to address.

pan-sharpening
diffusion models
sensor-agnostic
latent space
multispectral imagery
Innovation

Methods, ideas, or system contributions that make the work stand out.

latent diffusion
sensor-agnostic
pan-sharpening
cross-spectral attention
zero-shot generalization
Junjie Li
University of Science and Technology of China
Few-shot learning, Domain adaptation, Image inpainting
Congyang Ou
School of Cyberspace Security, Northwestern Polytechnical University, Xi’an, Shaanxi, China
Haokui Zhang
Northwestern Polytechnical University
Approximate nearest neighbor search, neural architecture search, depth estimation, HSI classification
Guoting Wei
School of Cyberspace Security, Northwestern Polytechnical University, Xi’an, Shaanxi, China
Shengqin Jiang
Nanjing University of Information Science and Technology
Ying Li
School of Cyberspace Security, Northwestern Polytechnical University, Xi’an, Shaanxi, China
Chunhua Shen
Zhejiang University
Computer Vision, Machine Learning