SolarFCD: A Large-Scale Dataset and Benchmark for Solar Fault Classification in Photovoltaic Systems

📅 2026-04-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses a key limitation in photovoltaic defect detection research, the scarcity of large-scale, multimodal, publicly available annotated datasets, by introducing SolarFCD, a unified dataset that integrates RGB (including drone-captured) and thermal infrared images under four classes: healthy, surface obstruction, structural fault, and electrical fault. The data were curated through label mapping across the source datasets, near-duplicate removal, and minority-class augmentation, and are released with a reproducible 80:10:10 train/validation/test split. Using SolarFCD, the authors benchmark 16 classification models spanning five architectural families. ResNet101V2 performs best, achieving 86.68% accuracy and an F1-score of 88.17%, with per-class results falling within a band of less than 1.2 percentage points, which establishes a multimodal benchmark baseline for photovoltaic defect detection.
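
To make the reported figures concrete, here is a minimal sketch of how accuracy, averaged precision/recall/F1, and the per-class spread can be derived from a confusion matrix. The source does not say how the metrics are averaged across classes; macro averaging is an assumption here, and the function names `per_class_metrics` and `summarize` are hypothetical.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1))
    return rows


def summarize(cm):
    """Accuracy, macro-averaged metrics (assumed averaging), and per-class F1 spread."""
    rows = per_class_metrics(cm)
    n = len(rows)
    total = sum(sum(row) for row in cm)
    f1_scores = [r[2] for r in rows]
    return {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "macro_precision": sum(r[0] for r in rows) / n,
        "macro_recall": sum(r[1] for r in rows) / n,
        "macro_f1": sum(f1_scores) / n,
        # Gap between the best and worst class, as a fraction (x100 for points).
        "f1_spread": max(f1_scores) - min(f1_scores),
    }
```

Under macro averaging every class counts equally regardless of its size, so averaged precision and recall can both sit above overall accuracy when classes are imbalanced.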

📝 Abstract

The increasing global deployment of solar photovoltaic (PV) systems requires robust, scalable, and automated inspection technologies capable of detecting a wide range of panel defects under varied operating conditions. The lack of large-scale, multi-modal, publicly available annotated datasets is a major obstacle to progress in this field. We introduce SolarFCD, an extensive dataset of solar panel defects created by systematically combining and reconciling three publicly accessible datasets covering two imaging modalities: RGB/drone images and thermal infrared. The dataset consists of 4,435 images organized under four unified classes: healthy, surface obstruction, structural fault, and electrical fault. It was prepared through systematic label mapping, near-duplicate removal, and targeted augmentation of minority classes, and divided into training, validation, and test splits at an 80:10:10 ratio. Sixteen classification architectures from five design families were trained and evaluated on the dataset to provide reproducible benchmark baselines. ResNet101V2 performed best overall, with an accuracy of 86.68%, precision of 88.65%, recall of 88.62%, and F1-score of 88.17%. Per-class results showed balanced detection across all four classes, within a narrow performance band of less than 1.2 percentage points. To promote open and reproducible research in automated PV inspection and solar energy operations and maintenance, the dataset, annotation files, and baseline code are made openly available.
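
The label-mapping and 80:10:10 splitting steps can be illustrated with a small standard-library sketch. This is not the authors' pipeline: the entries in `LABEL_MAP`, the function name `stratified_split`, the per-class stratification, and the fixed seed are all assumptions for illustration, and near-duplicate removal and augmentation are not shown.

```python
import random
from collections import defaultdict

# Hypothetical mapping from source-dataset labels to the four unified classes.
# The actual mapping used for SolarFCD is not given in the abstract.
LABEL_MAP = {
    "clean": "healthy",
    "dusty": "surface_obstruction",
    "bird_drop": "surface_obstruction",
    "cracked": "structural_fault",
    "hotspot": "electrical_fault",
}


def stratified_split(samples, ratios=(0.8, 0.1), seed=42):
    """Split (path, source_label) pairs into train/val/test per unified class.

    ratios gives the train and validation fractions; the remainder is test.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class = defaultdict(list)
    for path, source_label in samples:
        by_class[LABEL_MAP[source_label]].append(path)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):
        paths = sorted(by_class[label])  # sort first so input order does not matter
        rng.shuffle(paths)
        n_train = round(len(paths) * ratios[0])
        n_val = round(len(paths) * ratios[1])
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]
    return splits
```

Splitting within each class keeps the class proportions roughly equal across the three subsets, which matters when minority classes are small.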

Problem

Research questions and friction points this paper is trying to address.

solar photovoltaic systems

fault classification

defect detection

large-scale dataset

automated inspection

Innovation

Methods, ideas, or system contributions that make the work stand out.

SolarFCD

multi-modal dataset

photovoltaic fault classification