WaveHiT-SR: Hierarchical Wavelet Network for Efficient Image Super-Resolution

📅 2025-08-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Transformer-based super-resolution models are constrained by the quadratic computational complexity of windowed self-attention, which forces the use of small, fixed-size windows and limits receptive field coverage. Method: The paper proposes WaveHiT-SR, a wavelet-transform-based hierarchical Transformer network that combines multi-scale wavelet subband decomposition with an adaptive hierarchical windowing mechanism to model global structure and recover local texture together; a progressive reconstruction strategy reduces FLOPs and parameters while improving long-range dependency modeling. Contribution/Results: Experiments show that the WaveHiT-SR versions of SwinIR-Light, SwinIR-NG, and SRFormer-Light achieve state-of-the-art reconstruction quality with fewer parameters, lower FLOPs, and faster inference.

📝 Abstract
Transformers have demonstrated promising performance in computer vision tasks, including image super-resolution (SR). The quadratic computational complexity of window self-attention in many transformer-based SR methods forces the use of small, fixed windows, limiting the receptive field. In this paper, we propose a new approach, called WaveHiT-SR, that embeds the wavelet transform within a hierarchical transformer framework. First, using adaptive hierarchical windows instead of static small windows allows the network to capture features across different levels and greatly improves its ability to model long-range dependencies. Second, the proposed model uses wavelet transforms to decompose images into multiple frequency subbands, allowing the network to focus on both global and local features while preserving structural details. By progressively reconstructing high-resolution images through hierarchical processing, the network reduces computational complexity without sacrificing performance. The multi-level decomposition strategy enables the network to capture fine-grained information in low-frequency components while enhancing high-frequency textures. Extensive experiments confirm the effectiveness and efficiency of WaveHiT-SR. Our refined versions of SwinIR-Light, SwinIR-NG, and SRFormer-Light deliver state-of-the-art SR results, achieving higher efficiency with fewer parameters, lower FLOPs, and faster speeds.
Problem

Research questions and friction points this paper is trying to address.

Enhancing image super-resolution with hierarchical wavelet transformers
Reducing computational complexity in transformer-based SR methods
Capturing global and local features while preserving details
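
To make the complexity constraint concrete, the sketch below estimates the cost of window self-attention as the window grows. It is a rough illustration and not the paper's cost model: the function name `window_attention_cost`, the 64x64 feature map, and the 60-channel width are assumptions chosen for the example, and only the two attention matrix products are counted.

```python
def window_attention_cost(height, width, window, channels):
    """Approximate multiply-adds for the QK^T and AV products of window self-attention.

    Hypothetical helper for illustration; projections and MLP layers are ignored.
    """
    if height % window or width % window:
        raise ValueError("window size must divide both height and width")
    num_windows = (height // window) * (width // window)
    tokens = window * window  # tokens attended to inside one window
    # Each window builds a tokens x tokens attention map (QK^T) and applies it to V,
    # so the per-window cost is quadratic in the number of tokens per window.
    per_window = 2 * tokens * tokens * channels
    return num_windows * per_window


if __name__ == "__main__":
    for window in (8, 16, 32):
        cost = window_attention_cost(64, 64, window, 60)
        print(f"window={window:2d}  approx. multiply-adds={cost:,}")
```

Under these assumptions the total simplifies to 2 * H * W * window^2 * C, so doubling the window side multiplies the attention cost by four. That growth is why fixed-window designs tend to keep windows small.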
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical transformer with adaptive windows
Wavelet transform for multi-frequency decomposition
Progressive reconstruction reducing computational complexity
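
The multi-frequency decomposition can be illustrated with a single-level 2D Haar transform, which splits an image into one low-frequency subband and three detail subbands at half the resolution. This is a minimal sketch only: the summary does not say which wavelet family WaveHiT-SR uses, so the Haar choice, the function names `haar_dwt2` and `haar_idwt2`, and the subband variable names are all assumptions.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar transform on a list of rows.

    Assumes even height and width. Returns (low, d_cols, d_rows, d_diag),
    each half the size of the input. Names are illustrative, not from the paper.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    low, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, rows, 2):
        low_r, dc_r, dr_r, dd_r = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_r.append((a + b + d + e) / 2)  # local average: coarse structure
            dc_r.append((a - b + d - e) / 2)   # difference between adjacent columns
            dr_r.append((a + b - d - e) / 2)   # difference between adjacent rows
            dd_r.append((a - b - d + e) / 2)   # diagonal difference
        low.append(low_r)
        d_cols.append(dc_r)
        d_rows.append(dr_r)
        d_diag.append(dd_r)
    return low, d_cols, d_rows, d_diag


def haar_idwt2(low, d_cols, d_rows, d_diag):
    """Invert haar_dwt2, rebuilding the full-resolution image from four subbands."""
    image = []
    for i in range(len(low)):
        top, bottom = [], []
        for j in range(len(low[0])):
            s, x, y, z = low[i][j], d_cols[i][j], d_rows[i][j], d_diag[i][j]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        image += [top, bottom]
    return image


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    bands = haar_dwt2(img)
    print("low-frequency subband:", bands[0])
    print("perfect reconstruction:", haar_idwt2(*bands) == img)
```

Applying `haar_dwt2` again to the low-frequency subband gives a multi-level decomposition. Because the transform is invertible, processing can happen at reduced resolution without discarding information.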