SolarFCD: A Large-Scale Dataset and Benchmark for Solar Fault Classification in Photovoltaic Systems

📅 2026-04-26

🤖 AI Summary
This study addresses a critical limitation in photovoltaic defect detection research, namely the scarcity of large-scale, multimodal, publicly annotated datasets, by introducing SolarFCD, the first unified dataset that systematically integrates RGB (including drone-captured) and thermal infrared images across four classes: healthy, surface obstruction, structural fault, and electrical fault. The dataset is curated through label mapping, near-duplicate sample removal, and minority-class augmentation, and ships with a reproducible train/validation/test split. Using SolarFCD, the authors benchmark 16 classification models spanning five architectural families. ResNet101V2 emerges as the top performer, achieving 86.68% accuracy and an F1 score of 88.17%, with balanced performance across all four classes (variation below 1.2 percentage points), thereby establishing the first multimodal benchmark for photovoltaic defect detection.
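The per-class balance reported above can be checked with a few lines of code. The following is a minimal sketch, not the authors' evaluation script: it assumes the reported precision, recall, and F1 are macro averages and that the "variation" refers to per-class F1, neither of which the summary states. The function names are hypothetical.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, labels):
    """Return precision, recall, and F1 for each class in `labels`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    metrics = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_f1(metrics):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


def f1_spread_pp(metrics):
    """Gap between best and worst per-class F1, in percentage points."""
    f1s = [m["f1"] for m in metrics.values()]
    return (max(f1s) - min(f1s)) * 100
```

Calling `f1_spread_pp(per_class_metrics(y_true, y_pred, labels))` on test-set predictions would give a figure comparable to the sub-1.2-point band quoted above, under the stated assumptions.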

📝 Abstract
The increasing global deployment of solar photovoltaic (PV) systems requires robust, scalable, and automated inspection technologies capable of detecting a wide range of panel flaws under varied operating conditions. The lack of large-scale, multi-modal, publicly available annotated datasets is a major obstacle to progress in this field. We introduce SolarFCD, an extensive dataset of solar panel defects created by methodically combining and reconciling three publicly accessible datasets covering two imaging modalities: RGB/drone images and thermal infrared. The dataset consists of 4,435 images arranged under four unified classes: healthy, surface obstruction, structural fault, and electrical fault. The dataset was curated through methodical label mapping, near-duplicate removal, and targeted augmentation of minority classes, and divided into training, validation, and test splits at an 80:10:10 ratio. Sixteen classification architectures from five design families were trained and evaluated on the dataset to provide reproducible benchmark baselines. With an accuracy of 86.68%, precision of 88.65%, recall of 88.62%, and F1-score of 88.17%, ResNet101V2 performed best overall. Per-class results showed balanced detection across all four categories, within a narrow performance band of less than 1.2 percentage points. To promote open and reproducible research in automated PV inspection and solar energy operations and maintenance, the dataset, annotation files, and baseline code are made openly available.
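The label mapping and 80:10:10 split described in the abstract can be illustrated with a short sketch. This is not the released baseline code. The source label names in `LABEL_MAP` are invented for illustration, and the split is stratified by class, which is an assumption the abstract does not confirm. The function names are hypothetical as well.

```python
import random
from collections import defaultdict

# Hypothetical source labels mapped onto the four unified classes
# named in the abstract; the real mapping is defined by the authors.
LABEL_MAP = {
    "clean": "healthy",
    "dust": "surface_obstruction",
    "bird_drop": "surface_obstruction",
    "crack": "structural_fault",
    "hotspot": "electrical_fault",
}


def unify_labels(samples):
    """Map (path, source_label) pairs to (path, unified_label).

    Samples whose source label has no mapping are dropped.
    """
    return [(path, LABEL_MAP[label]) for path, label in samples
            if label in LABEL_MAP]


def stratified_split(samples, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split per class so each subset keeps roughly the class proportions.

    Stratification is an assumption of this sketch, not a documented
    property of the SolarFCD split.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)
    train, val, test = [], [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        n_train = round(len(items) * ratios[0])
        n_val = round(len(items) * ratios[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Splitting per class rather than over the whole pool keeps minority classes represented in the validation and test sets, which matters when per-class results are reported.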
Problem

Research questions and friction points this paper is trying to address.

solar photovoltaic systems
fault classification
defect detection
large-scale dataset
automated inspection
Innovation

Methods, ideas, or system contributions that make the work stand out.

SolarFCD
multi-modal dataset
photovoltaic fault classification
benchmark
open-source
Misbah Ijaz
Department of Computer Science, University of Gujrat, Gujrat, 51700, Pakistan
Saif Ur Rehman Khan
Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
Abd Ur Rehman
Department of Computer Science, University of Gujrat, Gujrat, 51700, Pakistan
Arooj Zaib
Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, Kaiserslautern, 67663, Germany
Sebastian Vollmer
German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany
Andreas Dengel
Professor of Computer Science, University of Kaiserslautern & Executive Director, DFKI
Artificial Intelligence, Machine Learning, Document Analysis, Semantic Technologies
Muhammad Nabeel Asim
German Research Center for Artificial Intelligence
Artificial Intelligence