M3S-Net: Multimodal Feature Fusion Network Based on Multi-scale Data for Ultra-short-term PV Power Forecasting

📅 2026-02-23
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the high-frequency fluctuations and intermittency of solar irradiance that rapidly moving clouds cause in high-penetration photovoltaic grids. Existing methods handle this poorly because they cannot model fine-grained cloud optical properties or the complex spatiotemporal coupling between meteorological and visual modalities. The authors propose M3S-Net, which combines three components: a multi-scale partial channel selection network that sharpens thin-cloud boundary detection, an FFT-based time-frequency analysis that captures periodic patterns in meteorological data, and a novel cross-modal Mamba interaction module. This module couples the visual and temporal modalities structurally through a dynamic C-matrix exchange mechanism while keeping computational complexity linear. Evaluated on a newly constructed fine-grained photovoltaic power dataset, the method reduces the mean absolute error of 10-minute ultra-short-term forecasts by 6.2% relative to state-of-the-art baselines.
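
The FFT-based step mentioned in the summary can be illustrated with a minimal sketch. It assumes a univariate, evenly sampled meteorological series and simply reports the strongest non-DC frequency bins as periods. The function name `dominant_periods` and the synthetic signal are hypothetical and not taken from the paper, whose sequence-to-image network is likely more elaborate than this.

```python
import numpy as np

def dominant_periods(x, k=2):
    """Return the k strongest periods (in samples) of a 1-D series via the real FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                   # remove the mean so the DC bin does not dominate
    amp = np.abs(np.fft.rfft(x))       # one-sided amplitude spectrum
    amp[0] = 0.0                       # ignore any residual DC component
    top = np.argsort(amp)[::-1][:k]    # indices of the k largest bins, strongest first
    return [len(x) / f for f in top]   # period in samples = N / frequency index

# Synthetic stand-in for a meteorological series (assumed, not real data):
# a strong cycle of 24 samples plus a weaker cycle of 6 samples.
t = np.arange(240)
series = np.sin(2 * np.pi * t / 24) + 0.5 * np.sin(2 * np.pi * t / 6)
periods = dominant_periods(series, k=2)
```

In this toy setting, the recovered periods could then be used to fold the 1-D series into 2-D arrays (one row per cycle), which is one common way to turn a sequence into an image-like input.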

๐Ÿ“ Abstract
The inherent intermittency and high-frequency variability of solar irradiance, particularly during rapid cloud advection, present significant stability challenges to high-penetration photovoltaic grids. Although multimodal forecasting has emerged as a viable mitigation strategy, existing architectures predominantly rely on shallow feature concatenation and binary cloud segmentation, thereby failing to capture the fine-grained optical features of clouds and the complex spatiotemporal coupling between visual and meteorological modalities. To bridge this gap, this paper proposes M3S-Net, a novel multimodal feature fusion network based on multi-scale data for ultra-short-term PV power forecasting. First, a multi-scale partial channel selection network leverages partial convolutions to explicitly isolate the boundary features of optically thin clouds, effectively transcending the precision limitations of coarse-grained binary masking. Second, a multi-scale sequence-to-image analysis network employs Fast Fourier Transform (FFT)-based time-frequency representation to disentangle the complex periodicity of meteorological data across varying time horizons. Crucially, the model incorporates a cross-modal Mamba interaction module featuring a novel dynamic C-matrix swapping mechanism. By exchanging state-space parameters between visual and temporal streams, this design conditions the state evolution of one modality on the context of the other, enabling deep structural coupling with linear computational complexity, thus overcoming the limitations of shallow concatenation. Experimental validation on the newly constructed fine-grained PV power dataset demonstrates that M3S-Net achieves a mean absolute error reduction of 6.2% in 10-minute forecasts compared to state-of-the-art baselines. The dataset and source code will be available at https://github.com/she1110/FGPD.
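
The abstract does not spell out the C-matrix swapping mechanism, so the following is only a toy sketch of the general idea under stated assumptions: two scalar input streams of equal length, a shared diagonal state matrix `A`, and input-dependent `B` and `C` projections produced by hypothetical linear maps `W_B` and `W_C`. Each stream updates its own hidden state but is read out with the other stream's `C`. None of these names or shapes come from the paper or from any Mamba library.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential scan of a diagonal linear state-space model.

    h_t = A * h_{t-1} + B_t * x_t,   y_t = <C_t, h_t>
    x: (T,) inputs; A: (N,) diagonal decay; B, C: (T, N) per-step projections.
    Cost is O(T * N), i.e. linear in the sequence length T.
    """
    h = np.zeros(A.shape[0])
    y = np.empty(len(x))
    for t in range(len(x)):
        h = A * h + B[t] * x[t]   # state update driven by this stream's own input
        y[t] = C[t] @ h           # readout through whichever C matrix is supplied
    return y

rng = np.random.default_rng(0)
T, N = 16, 4                                # assumed sequence length and state size
x_vis = rng.normal(size=T)                  # stand-in for visual (sky image) features
x_met = rng.normal(size=T)                  # stand-in for meteorological features
A = np.full(N, 0.9)                         # assumed stable diagonal dynamics

# Hypothetical input-dependent projections (selective-SSM style).
W_B, W_C = rng.normal(size=N), rng.normal(size=N)
B_vis, C_vis = np.outer(x_vis, W_B), np.outer(x_vis, W_C)
B_met, C_met = np.outer(x_met, W_B), np.outer(x_met, W_C)

# Swap: each stream's state is read out with the other modality's C.
y_vis = ssm_scan(x_vis, A, B_vis, C_met)
y_met = ssm_scan(x_met, A, B_met, C_vis)
```

Because only the readout projection is exchanged, each stream keeps a single sequential scan, which is consistent with the linear-complexity claim. Whether the paper swaps `C` in exactly this way is an assumption.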
Problem

Research questions and friction points this paper is trying to address.

ultra-short-term PV power forecasting
multimodal feature fusion
cloud optical features
spatiotemporal coupling
solar irradiance variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal feature fusion
multi-scale partial convolution
FFT-based time-frequency representation
cross-modal Mamba
dynamic C-matrix swapping
Penghui Niu
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
Taotao Cai
University of Southern Queensland
Suqi Zhang
School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
Junhua Gu
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin 300401, China
Ping Zhang
Hebei University of Technology
Feature selection, Machine learning
Qiqi Liu
Trustworthy and General AI Lab, School of Engineering, Westlake University, Hangzhou, 310030, China
Jianxin Li
Edith Cowan University
Knowledge Graph, Data Mining, Social Network, Educational Technologies