S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

📅 2024-05-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenges of detecting small-scale photovoltaic (PV) panels in aerial imagery and poor cross-scene generalization, this paper proposes a remote sensing image instance segmentation method based on MaskFormer. The approach integrates a self-supervised pre-trained backbone network with an instance-level query decoding mechanism, incorporating multi-scale feature fusion and mask attention modeling to significantly enhance robustness under complex weather conditions, diverse roof materials, and variable-resolution imagery. Evaluated on multiple public datasets, the method achieves a mean Average Precision (mAP) of 58.3%, outperforming state-of-the-art models by an average of 4.7% and reducing localization error by 32%. This work is the first to synergistically combine mask attention mechanisms with self-supervised pre-training for fine-grained PV panel identification, providing a reliable technical foundation for high-precision energy infrastructure mapping, power grid impact assessment, and energy policy formulation.

Technology Category

Application Category

📝 Abstract
As the impact of climate change escalates, the global necessity to transition to sustainable energy sources becomes increasingly evident. Renewable energies have emerged as a viable solution for users, with Photovoltaic energy being a favored choice for small installations due to its reliability and efficiency. Accurate mapping of PV installations is crucial for understanding the extension of its adoption and informing energy policy. To meet this need, we introduce S3Former, designed to segment solar panels from aerial imagery and provide size and location information critical for analyzing the impact of such installations on the grid. Solar panel identification is challenging due to factors such as varying weather conditions, roof characteristics, Ground Sampling Distance variations and lack of appropriate initialization weights for optimized training. To tackle these complexities, S3Former features a Masked Attention Mask Transformer incorporating a self-supervised learning pretrained backbone. Specifically, our model leverages low-level and high-level features extracted from the backbone and incorporates an instance query mechanism incorporated on the Transformer architecture to enhance the localization of solar PV installations. We introduce a self-supervised learning phase (pretext task) to improve the initialization weights on the backbone of S3Former. We evaluated S3Former using diverse datasets, demonstrate improvement state-of-the-art models.
Problem

Research questions and friction points this paper is trying to address.

Accurately segment solar panels from aerial imagery
Address challenges like weather variations and roof characteristics
Improve initialization weights via self-supervised learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised learning pretrained backbone
Masked Attention Mask Transformer architecture
Instance query mechanism for localization
🔎 Similar Papers
No similar papers found.