RPT-SR: Regional Prior attention Transformer for infrared image Super-Resolution

📅 2026-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing general-purpose super-resolution models do not exploit the stable spatial scene priors available in fixed-view infrared imaging, which limits their performance. To address this, this work proposes a Vision Transformer architecture that introduces, for the first time in infrared image super-resolution, an explicit regional prior attention mechanism. The design uses two token types, learnable regional prior tokens and local content tokens, to jointly model global structural memory and fine-grained local detail. The method is evaluated on both long-wave (LWIR) and short-wave (SWIR) infrared bands and achieves state-of-the-art results on multiple infrared datasets, which supports its generality across bands.

📝 Abstract

General-purpose super-resolution models, particularly Vision Transformers, have achieved remarkable success but exhibit fundamental inefficiencies in common infrared imaging scenarios such as surveillance and autonomous driving, which operate from fixed or nearly static viewpoints. These models fail to exploit the strong, persistent spatial priors inherent in such scenes, leading to redundant learning and suboptimal performance. To address this, we propose the Regional Prior attention Transformer for infrared image Super-Resolution (RPT-SR), a novel architecture that explicitly encodes scene layout information into the attention mechanism. Our core contribution is a dual-token framework that fuses (1) learnable regional prior tokens, which act as a persistent memory for the scene's global structure, with (2) local tokens that capture the frame-specific content of the current input. By incorporating both token types into the attention computation, our model allows the priors to dynamically modulate the local reconstruction process. Extensive experiments validate our approach. While most prior works focus on a single infrared band, we demonstrate the broad applicability and versatility of RPT-SR by establishing new state-of-the-art performance across diverse datasets covering both Long-Wave (LWIR) and Short-Wave (SWIR) spectra.
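
The abstract does not spell out how the two token types are combined, so the following is only a minimal sketch of one plausible reading: local tokens act as queries, while keys and values are drawn from the learnable prior tokens concatenated with the local tokens. The function name `prior_modulated_attention`, the token counts, and the single-head, single-window layout are all illustrative assumptions, not the RPT-SR implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_modulated_attention(local_tokens, prior_tokens, w_q, w_k, w_v):
    """Single-head attention over [prior tokens ; local tokens].

    local_tokens: (n_local, d) features of the current frame.
    prior_tokens: (n_prior, d) scene-level parameters (learned in practice).
    ASSUMPTION: queries come from local tokens only; keys and values come
    from the concatenation, so each local token can read the scene memory.
    """
    memory = np.concatenate([prior_tokens, local_tokens], axis=0)
    q = local_tokens @ w_q                      # (n_local, d)
    k = memory @ w_k                            # (n_prior + n_local, d)
    v = memory @ w_v                            # (n_prior + n_local, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # scaled dot-product
    weights = softmax(scores, axis=-1)          # rows sum to 1
    return weights @ v, weights

# Hypothetical sizes: an 8x8 window of local tokens and 4 prior tokens.
rng = np.random.default_rng(0)
d, n_local, n_prior = 16, 64, 4
local_tokens = rng.standard_normal((n_local, d))
prior_tokens = rng.standard_normal((n_prior, d)) * 0.02  # stand-in for learned values
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

out, weights = prior_modulated_attention(local_tokens, prior_tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

In this sketch the first `n_prior` columns of `weights` show how strongly each local token draws on the scene memory. Because the prior tokens are parameters rather than functions of the input, they stay the same from frame to frame, which is the sense in which they could serve as a persistent memory for a fixed viewpoint.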

Problem

Research questions and friction points this paper is trying to address.

infrared image super-resolution

spatial priors

fixed viewpoint

Vision Transformers

scene layout

Innovation

Methods, ideas, or system contributions that make the work stand out.

Regional Prior Tokens

Infrared Image Super-Resolution

Vision Transformer