🤖 AI Summary
This work addresses three challenges in radar echo sequence prediction: modeling multi-scale evolution, correcting inter-frame feature misalignment, and capturing long-range spatiotemporal dependencies without sacrificing spatial fidelity. To this end, the authors propose MFC-RFNet, a multi-scale guided rectified flow network that integrates rectified flow training, bidirectional cross-scale feature communication, a condition-guided spatial alignment mechanism, and wavelet-guided skip connections. The design preserves high-frequency details and enables few-step sampling with stable fidelity. Key components include the Wavelet-Guided Skip Connection (WGSC), the Feature Communication Module (FCM), the Condition-Guided Spatial Transform Fusion (CGSTF), and lightweight Vision-RWKV blocks. Experiments on the SEVIR, MeteoNet, Shanghai, and CIKM datasets show consistent improvements over strong baselines, with clearer echo structures at higher rain-rate thresholds and sustained skill at longer lead times.
📝 Abstract
Accurate and high-resolution precipitation nowcasting from radar echo sequences is crucial for disaster mitigation and economic planning, yet it remains a significant challenge. Key difficulties include modeling complex multi-scale evolution, correcting inter-frame feature misalignment caused by displacement, and efficiently capturing long-range spatiotemporal context without sacrificing spatial fidelity. To address these issues, we present the Multi-scale Feature Communication Rectified Flow (RF) Network (MFC-RFNet), a generative framework that integrates multi-scale communication with guided feature fusion. To enhance multi-scale fusion while retaining fine detail, a Wavelet-Guided Skip Connection (WGSC) preserves high-frequency components, and a Feature Communication Module (FCM) promotes bidirectional cross-scale interaction. To correct inter-frame displacement, a Condition-Guided Spatial Transform Fusion (CGSTF) learns spatial transforms from conditioning echoes to align shallow features. The backbone adopts rectified flow training to learn near-linear probability-flow trajectories, enabling few-step sampling with stable fidelity. Additionally, lightweight Vision-RWKV (RWKV) blocks are placed at the encoder tail, the bottleneck, and the first decoder layer to capture long-range spatiotemporal dependencies at low spatial resolutions with moderate compute. Evaluations on four public datasets (SEVIR, MeteoNet, Shanghai, and CIKM) demonstrate consistent improvements over strong baselines, yielding clearer echo morphology at higher rain-rate thresholds and sustained skill at longer lead times. These results suggest that the proposed synergy of RF training with scale-aware communication, spatial alignment, and frequency-aware fusion presents an effective and robust approach for radar-based nowcasting.
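The rectified flow idea referred to in the abstract can be illustrated with a minimal NumPy sketch. It is not the MFC-RFNet implementation: it assumes the standard rectified flow formulation (a straight-line interpolation between a Gaussian noise sample and a data sample, with the network regressing the constant velocity along that line), and the names `rf_training_pair`, `rf_loss`, and `euler_sample` are hypothetical helpers introduced only for this illustration. Conditioning on past radar frames is omitted for brevity.

```python
import numpy as np

def rf_training_pair(x_data, rng):
    """Build one rectified-flow training example for a batch of frames.

    Assumes the standard formulation: x_t = (1 - t) * x_noise + t * x_data,
    with regression target v = x_data - x_noise.
    """
    x_noise = rng.standard_normal(x_data.shape)
    # One time value per sample, shaped to broadcast over the remaining axes.
    t = rng.uniform(size=(x_data.shape[0],) + (1,) * (x_data.ndim - 1))
    x_t = (1.0 - t) * x_noise + t * x_data
    v_target = x_data - x_noise
    return x_noise, t, x_t, v_target

def rf_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return float(np.mean((v_pred - v_target) ** 2))

def euler_sample(velocity_fn, x_noise, num_steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x_noise.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_data = rng.random((2, 1, 8, 8))  # stand-in for a batch of echo frames
    x_noise, t, x_t, v_target = rf_training_pair(x_data, rng)
    # A hypothetical "perfect" velocity model, used here instead of a network.
    oracle = lambda x, t_scalar: x_data - x_noise
    sample = euler_sample(oracle, x_noise, num_steps=4)
    print(np.allclose(sample, x_data))
```

When the velocity field is exactly constant along each trajectory, as with the oracle above, Euler integration reaches the data sample regardless of the step count. A learned field is only approximately straight, so in practice the number of steps trades off against fidelity; this is the intuition behind few-step sampling on near-linear trajectories.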