Edge-Cloud Collaborative Reconstruction via Structure-Aware Latent Diffusion for Downstream Remote Sensing Perception

📅 2026-04-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the loss of high-frequency structural information in high-resolution remote sensing images under the aggressive compression required for satellite-to-ground transmission, a loss that impairs downstream perception tasks. The authors propose Structure-Aware Latent Diffusion (SALD), an asymmetric edge-cloud collaborative super-resolution framework. At the edge, images are decomposed into a highly compressed low-frequency payload and a lightweight soft structural prior. In the cloud, a Structure-Gated Large Kernel (SGLK) module and a Semantic-Guidance Engine (SGE) inside the diffusion backbone use the transmitted prior to gate large-kernel convolutions, recovering fine details while suppressing hallucinations. The design targets the trade-off between perceptual quality and structural fidelity under extreme compression. Experiments on the MSCM and UCMerced datasets report better perceptual quality (LPIPS) along with improved scene classification and small-target detection.
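The edge-side split described above can be illustrated with a minimal sketch. The summary does not state which operators the paper uses, so everything here is an assumption: block averaging stands in for the compressed low-frequency payload, a normalized gradient magnitude stands in for the soft structural prior, and the name edge_decompose and the scale parameter are hypothetical.

```python
import numpy as np


def edge_decompose(image: np.ndarray, scale: int = 4):
    """Hypothetical edge-side split of a grayscale image.

    Assumption: these operators are illustrative stand-ins, not the
    decomposition used in the paper.
    """
    h, w = image.shape
    h, w = h - h % scale, w - w % scale  # crop to a multiple of scale
    img = image[:h, :w].astype(np.float64)

    # Low-frequency payload: block-average downsampling (assumed stand-in
    # for the highly compressed low-frequency stream).
    blocks = img.reshape(h // scale, scale, w // scale, scale)
    payload = blocks.mean(axis=(1, 3))

    # Soft structural prior: gradient magnitude, max-pooled onto the payload
    # grid, normalized to [0, 1], then quantized to 8 bits to stay lightweight.
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    pooled = magnitude.reshape(h // scale, scale, w // scale, scale).max(axis=(1, 3))
    peak = pooled.max()
    soft = pooled / peak if peak > 0 else np.zeros_like(pooled)
    prior = np.round(soft * 255).astype(np.uint8)
    return payload, prior
```

With scale = 4, each output holds one value per 4x4 block, so the two transmitted arrays together carry far fewer values than the original image; an actual system would additionally entropy-code both streams.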
📝 Abstract
The exponential surge in high-resolution remote sensing data faces a severe bottleneck in satellite-to-ground transmission. Limited downlink bandwidth forces the use of extreme high-ratio compression, which irreversibly destroys high-frequency structural details essential for downstream machine perception tasks like object detection. While current super-resolution techniques attempt to recover these details, regression-based methods often yield over-smoothed textures, and generative diffusion models frequently introduce structural hallucinations that mislead detection systems. To address this trade-off, we propose the Structure-Aware Latent Diffusion (SALD) framework, an asymmetric edge-cloud collaborative SR system. At the resource-constrained edge, the system decouples imagery into a highly compressed low-frequency payload and a lightweight soft structural prior. Transmitting this decoupled representation minimizes bandwidth consumption. On the powerful cloud side, we introduce a Structure-Gated Large Kernel (SGLK) module and a Semantic-Guidance Engine (SGE) within the diffusion backbone. These modules leverage the transmitted structural priors to gate large-kernel convolutions, effectively capturing long-range dependencies inherent in aerial scenes while actively suppressing generative hallucinations. Extensive experiments on both the MSCM and UCMerced datasets demonstrate that, even under extreme bandwidth constraints, SALD achieves superior perceptual quality (LPIPS) and significantly enhances downstream performance in both scene classification and small-target detection.
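The idea of using the structural prior to gate a large-kernel convolution can be sketched as follows. The abstract does not give the SGLK formulation, so this is one plausible gating under stated assumptions: a single-channel feature map, a fixed rather than learned kernel, and a convex blend between the large-kernel response and the input. The names large_kernel_filter and structure_gated_filter are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def large_kernel_filter(feat: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Same-size 2D cross-correlation with an odd-sized square kernel."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(feat, pad, mode="reflect")
    # windows has shape (H, W, k, k): one k x k neighborhood per pixel.
    windows = sliding_window_view(padded, (k, k))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def structure_gated_filter(feat: np.ndarray, prior: np.ndarray,
                           kernel: np.ndarray) -> np.ndarray:
    """Blend long-range context into feat where the prior is high.

    Assumption: the gate is the soft prior clipped to [0, 1]; the paper's
    SGLK module may compute and apply its gate differently.
    """
    gate = np.clip(prior, 0.0, 1.0)
    context = large_kernel_filter(feat, kernel)
    return gate * context + (1.0 - gate) * feat
```

In this sketch a gate of 0 passes the feature through unchanged and a gate of 1 replaces it with the large-kernel response, so the transmitted prior decides where long-range context is allowed to reshape the features.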
Problem

Research questions and friction points this paper is trying to address.

remote sensing
bandwidth bottleneck
structural detail loss
super-resolution
downstream perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structure-Aware Latent Diffusion
Edge-Cloud Collaboration
Structural Prior
Hallucination Suppression
Remote Sensing Super-Resolution