WaveSFNet: A Wavelet-Based Codec and Spatial-Frequency Dual-Domain Gating Network for Spatiotemporal Prediction

📅 2026-03-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a central challenge in unsupervised spatiotemporal prediction: modeling long-range dynamics while preserving high-frequency details. It proposes WaveSFNet, a framework that combines a wavelet-based encoder-decoder with a spatial-frequency dual-domain gating mechanism. Wavelet downsampling retains high-frequency subband information, inter-frame differencing enhances dynamic features, and large-kernel convolutions for local spatial modeling are fused with global frequency-domain modulation. Together, these components are intended to mitigate the texture loss that strided convolutions and pooling tend to cause. Experiments on the Moving MNIST, TaxiBJ, and WeatherBench benchmarks show competitive multi-step prediction accuracy at low computational complexity.
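
To illustrate why wavelet downsampling can retain detail that strided sampling drops, here is a minimal NumPy sketch of a one-level 2D Haar transform. The choice of the Haar wavelet and the function names `haar_dwt2` and `haar_idwt2` are assumptions made for illustration; the summary does not state which wavelet or implementation WaveSFNet uses.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four (H/2, W/2) subbands: a coarse band plus three detail
    bands (column difference, row difference, diagonal difference).
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    coarse = (a + b + c + d) / 2.0
    detail_cols = (a - b + c - d) / 2.0  # left column minus right column
    detail_rows = (a + b - c - d) / 2.0  # top row minus bottom row
    detail_diag = (a - b - c + d) / 2.0  # diagonal minus anti-diagonal
    return coarse, detail_cols, detail_rows, detail_diag

def haar_idwt2(coarse, detail_cols, detail_rows, detail_diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution array exactly."""
    h, w = coarse.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (coarse + detail_cols + detail_rows + detail_diag) / 2.0
    x[0::2, 1::2] = (coarse - detail_cols + detail_rows - detail_diag) / 2.0
    x[1::2, 0::2] = (coarse + detail_cols - detail_rows - detail_diag) / 2.0
    x[1::2, 1::2] = (coarse - detail_cols - detail_rows + detail_diag) / 2.0
    return x

# A checkerboard is pure high-frequency texture.
board = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)

strided = board[::2, ::2]        # stride-2 sampling sees only zeros
bands = haar_dwt2(board)         # the diagonal band keeps the pattern
restored = haar_idwt2(*bands)    # lossless reconstruction

print(strided)
print(bands[3])
print(np.allclose(restored, board))
```

In this toy case the stride-2 sample is constant, so the texture cannot be recovered from it, whereas the four subbands together reconstruct the input exactly. A decoder that receives the detail bands therefore has access to information that a strided encoder would have discarded.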

📝 Abstract

Spatiotemporal predictive learning aims to forecast future frames from historical observations in an unsupervised manner, and is critical to a wide range of applications. The key challenge is to model long-range dynamics while preserving high-frequency details for sharp multi-step predictions. Existing efficient recurrent-free frameworks typically rely on strided convolutions or pooling for sampling, which tends to discard textures and boundaries, while purely spatial operators often struggle to balance local interactions with global propagation. To address these issues, we propose WaveSFNet, an efficient framework that unifies a wavelet-based codec with a spatial-frequency dual-domain gated spatiotemporal translator. The wavelet-based codec preserves high-frequency subband cues during downsampling and reconstruction. Meanwhile, the translator first injects adjacent-frame differences to explicitly enhance dynamic information, and then performs dual-domain gated fusion between large-kernel spatial local modeling and frequency-domain global modulation, together with gated channel interaction for cross-channel feature exchange. Extensive experiments demonstrate that WaveSFNet achieves competitive prediction accuracy on Moving MNIST, TaxiBJ, and WeatherBench, while maintaining low computational complexity. Our code is available at https://github.com/fhjdqaq/WaveSFNet.
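
The translator steps described in the abstract (difference injection, then gated fusion of a spatial branch and a frequency branch) can be sketched roughly as follows in NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the scalar `alpha`, the per-pixel sigmoid gate, and the convex-combination fusion rule are all hypothetical, and gated channel interaction is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def inject_frame_differences(x, alpha=1.0):
    """x: (T, C, H, W). Adds alpha * (x[t] - x[t-1]) to each frame.
    The first frame has no predecessor, so its difference is zero."""
    diff = np.zeros_like(x)
    diff[1:] = x[1:] - x[:-1]
    return x + alpha * diff

def depthwise_conv_same(x, kernel):
    """Spatial branch (assumed form): per-channel cross-correlation.
    x: (N, H, W); kernel: (N, k, k) with odd k; zero 'same' padding."""
    k = kernel.shape[-1]
    p = k // 2
    padded = np.pad(x, ((0, 0), (p, p), (p, p)))
    windows = sliding_window_view(padded, (k, k), axis=(-2, -1))  # (N, H, W, k, k)
    return np.einsum("nhwij,nij->nhw", windows, kernel)

def frequency_modulation(x, weight):
    """Frequency branch (assumed form): scale each 2D frequency bin.
    x: (N, H, W); weight: complex array of shape (N, H, W // 2 + 1)."""
    spec = np.fft.rfft2(x, axes=(-2, -1))
    return np.fft.irfft2(spec * weight, s=x.shape[-2:], axes=(-2, -1))

def dual_domain_gate(x, kernel, freq_weight, gate_logits):
    """Hypothetical fusion: g * local + (1 - g) * global, g = sigmoid(logits)."""
    local = depthwise_conv_same(x, kernel)
    global_ = frequency_modulation(x, freq_weight)
    g = 1.0 / (1.0 + np.exp(-gate_logits))
    return g * local + (1.0 - g) * global_

# Toy usage with random stand-ins for learned parameters.
rng = np.random.default_rng(0)
T, C, H, W, k = 4, 2, 8, 8, 7
frames = rng.normal(size=(T, C, H, W))

feats = inject_frame_differences(frames).reshape(T * C, H, W)
kernel = rng.normal(size=(T * C, k, k)) / (k * k)
freq_weight = rng.normal(size=(T * C, H, W // 2 + 1)) + 0j
gate_logits = rng.normal(size=(T * C, H, W))

fused = dual_domain_gate(feats, kernel, freq_weight, gate_logits)
print(fused.shape)  # same spatial size as the input features
```

The point of the sketch is the division of labor: the large-kernel convolution mixes information within a local window, while a single multiplication in the Fourier domain affects every spatial position at once, and the gate decides per location how much of each to keep. In a trained model the gate logits would presumably be computed from the features rather than supplied directly.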

Problem

Research questions and friction points this paper is trying to address.

spatiotemporal prediction

high-frequency details

long-range dynamics

unsupervised learning

multi-step forecasting

Innovation

Methods, ideas, or system contributions that make the work stand out.

wavelet-based codec

spatial-frequency dual-domain gating

spatiotemporal prediction