🤖 AI Summary
Existing air quality forecasting models have limited ability to predict extreme PM pollution events triggered by wildfires, haze, and dust storms. To address this, the authors propose SynCast, a deep learning framework that combines a region-adaptive Transformer architecture with a diffusion-based stochastic refinement module and an extreme-value-theory-driven, domain-aware loss function for joint forecasting of PM₁, PM₂.₅, and PM₁₀. SynCast fuses ERA5 meteorological reanalysis with CAMS atmospheric composition reanalysis data, enabling globally scalable, high-resolution predictions. Experiments show that SynCast outperforms existing methods across all three PM variables, with the largest gains on hazardous peak concentrations in highly impacted regions and no loss of global accuracy. It improves both forecast accuracy and early-warning lead time for extreme pollution episodes. SynCast is positioned as a scalable foundation for next-generation air quality early warning systems that support public health risk mitigation.
📝 Abstract
Air pollution remains a leading global health and environmental risk, particularly in regions vulnerable to episodic pollution spikes caused by wildfires, urban haze, and dust storms. Accurate forecasting of particulate matter (PM) concentrations is essential for timely public health warnings and interventions, yet existing models often underestimate rare but hazardous pollution events. Here, we present SynCast, a high-resolution neural forecasting model that integrates meteorological and air composition data to improve predictions of both average and extreme pollution levels. Built on a regionally adapted transformer backbone and enhanced with a diffusion-based stochastic refinement module, SynCast captures the nonlinear dynamics driving PM spikes more accurately than existing approaches. Leveraging harmonized ERA5 and CAMS datasets, our model shows substantial gains in forecasting fidelity across multiple PM variables (PM$_1$, PM$_{2.5}$, PM$_{10}$), especially under extreme conditions. We demonstrate that conventional loss functions underrepresent the distributional tails, which correspond to rare pollution events, and show that SynCast, guided by domain-aware objectives and extreme value theory, significantly enhances performance in highly impacted regions without compromising global accuracy. This approach provides a scalable foundation for next-generation air quality early warning systems and supports climate-health risk mitigation in vulnerable regions.
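The claim that conventional loss functions underrepresent distributional tails can be illustrated with a toy calculation. The sketch below is not SynCast's objective: it substitutes a simple threshold-weighted MSE for the extreme-value-theory loss the abstract mentions, and the function names, the threshold of 100, the weight of 10, and the synthetic concentrations are all hypothetical choices made for illustration.

```python
import numpy as np


def mse(pred, target):
    """Plain mean squared error over all samples."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def tail_weighted_mse(pred, target, threshold, tail_weight=10.0):
    """Weighted MSE that up-weights samples whose target exceeds `threshold`.

    Hypothetical stand-in for a tail-aware objective; it is NOT the
    extreme-value-theory loss described in the abstract.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Samples in the tail get weight `tail_weight`, all others weight 1.
    weights = np.where(target > threshold, tail_weight, 1.0)
    return float(np.average((pred - target) ** 2, weights=weights))


# Synthetic data (assumed values): 98 background samples and 2 extreme spikes.
target = np.concatenate([np.full(98, 20.0), np.full(2, 300.0)])
# A forecast that is perfect on background values but halves the spikes.
pred = np.concatenate([np.full(98, 20.0), np.full(2, 150.0)])

plain = mse(pred, target)
weighted = tail_weighted_mse(pred, target, threshold=100.0, tail_weight=10.0)

print(f"plain MSE:         {plain:.2f}")
print(f"tail-weighted MSE: {weighted:.2f}")
```

Because only 2 of 100 samples are extreme, the plain MSE averages the large spike errors over the whole set, while the weighted variant penalizes the same forecast far more heavily. A fixed threshold and weight are the crudest possible design; an approach grounded in extreme value theory would instead model the tail distribution itself.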