🤖 AI Summary
This work addresses super-resolution of infrared videos degraded by atmospheric turbulence and compression artifacts. Existing methods either neglect the modality gap between infrared and visible light or fail to jointly model turbulence-induced distortions and resolution loss. To overcome these limitations, we propose HATIR, which uses the temporal consistency of phasor responses in thermally active regions for turbulence-aware optical flow estimation and injects a heat-aware deformation prior into the reverse diffusion sampling process, thereby jointly modeling the inverse of turbulence degradation and structural detail loss. A turbulence-aware decoder combines turbulence gating with structure-aware attention to suppress temporal instability and enhance edge feature aggregation. We also present FLIR-IVSR, the first infrared video super-resolution dataset under real turbulence conditions, comprising 640 diverse scenes, and demonstrate significant improvements in both super-resolution quality and structural fidelity on real-world infrared videos.
📝 Abstract
Infrared video has been of great interest for visual tasks in challenging environments, but it often suffers from severe atmospheric turbulence and compression degradation. Existing video super-resolution (VSR) methods either neglect the inherent modality gap between infrared and visible images or fail to restore turbulence-induced distortions. Directly cascading turbulence mitigation (TM) algorithms with VSR methods leads to error propagation and accumulation, because turbulence and resolution degradation are modeled separately. We introduce HATIR (Heat-Aware Diffusion for Turbulent InfraRed Video Super-Resolution), which injects heat-aware deformation priors into the diffusion sampling path to jointly model the inverse process of turbulent degradation and structural detail loss. Specifically, HATIR constructs a Phasor-Guided Flow Estimator, rooted in the physical principle that thermally active regions exhibit consistent phasor responses over time, which yields reliable turbulence-aware flow to guide the reverse diffusion process. To ensure the fidelity of structural recovery under nonuniform distortions, a Turbulence-Aware Decoder selectively suppresses unstable temporal cues and enhances edge-aware feature aggregation via turbulence gating and structure-aware attention. We also built FLIR-IVSR, the first dataset for turbulent infrared VSR, comprising paired LR-HR sequences captured with a FLIR T1050sc camera (1024 × 768) and spanning 640 diverse scenes with varying camera and object motion conditions. The dataset is intended to encourage future research in infrared VSR. Project page: https://github.com/JZ0606/HATIR