A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models

📅 2026-04-23

📈 Citations: 0

✨ Influential: 0

career value

192K/year

🤖 AI Summary

This work proposes a scale-adaptive spatiotemporal video super-resolution framework that overcomes the limitations of existing methods, which are often confined to either spatial or temporal dimensions and struggle to generalize across diverse scale factors. The approach decouples the task into two components: an attention-based conditional mean predictor and a residual conditional diffusion model, optionally augmented with a mass-conserving transformation to preserve physical consistency. By dynamically adjusting three scale-dependent hyperparameters—noise schedule magnitude, temporal context length, and the mass conservation function—the unified architecture supports spatial upscaling factors from 1× to 25× and temporal factors from 1× to 6×. Experiments on French precipitation reanalysis data demonstrate that the method achieves strong cross-scale performance and efficient model reuse.

Technology Category

Application Category

📝 Abstract

Deep-learning video super-resolution has progressed rapidly, but climate applications typically super-resolve (increase resolution) either space or time, and joint spatiotemporal models are often designed for a single pair of super-resolution (SR) factors (upscaling spatial and temporal ratio between the low-resolution sequence and the high-resolution sequence), limiting transfer across spatial resolutions and temporal cadences (frame rates). We present a scale-adaptive framework that reuses the same architecture across factors by decomposing spatiotemporal SR into a deterministic prediction of the conditional mean, with attention, and a residual conditional diffusion model, with an optional mass-conservation (same precipitation amount in inputs and outputs) transform to preserve aggregated totals. Assuming that larger SR factors primarily increase underdetermination (hence required context and residual uncertainty) rather than changing the conditional-mean structure, scale adaptivity is achieved by retuning three factor-dependent hyperparameters before retraining: the diffusion noise schedule amplitude beta (larger for larger factors to increase diversity), the temporal context length L (set to maintain comparable attention horizons across cadences) and optionally a third, the mass-conservation function f (tapered to limit the amplification of extremes for large factors). Demonstrated on reanalysis precipitation over France (Comephore), the same architecture spans super-resolution factors from 1 to 25 in space and 1 to 6 in time, yielding a reusable architecture and tuning recipe for joint spatiotemporal super-resolution across scales.

Problem

Research questions and friction points this paper is trying to address.

spatiotemporal super-resolution

scale adaptivity

climate applications

video super-resolution

resolution factors

Innovation

Methods, ideas, or system contributions that make the work stand out.

scale-adaptive

spatiotemporal super-resolution

diffusion models