🤖 AI Summary
This work addresses the limitations of existing sequential approaches to low-light image super-resolution (LLISR), which often amplify artifacts, suppress textures, and degrade structural fidelity. To overcome these issues, the authors propose a novel “Decoupling then Perceive” (DTP) framework that explicitly disentangles luminance and texture components in the frequency domain for the first time. The framework integrates three key modules: Frequency-aware Structural Decoupling (FSD), Semantics-specific Dual-path Representation (SDR) learning, and Cross-frequency Semantic Recomposition (CSR). FSD separates the input into low-frequency luminance and high-frequency texture, SDR enhances and reconstructs each component along its own path, and CSR selectively recombines the two. Extensive experiments on mainstream LLISR benchmarks show that the method outperforms the state-of-the-art algorithm, achieving a 1.6% gain in PSNR, a 9.6% gain in SSIM, and a 48% reduction in LPIPS, thereby improving both structural fidelity and perceptual quality.
📝 Abstract
Low-light image super-resolution (LLISR) is essential for restoring fine visual details and perceptual quality in images captured under insufficient illumination by ubiquitous low-resolution devices. Although pioneering methods achieve high performance on the individual tasks of low-light enhancement and super-resolution, they solve the two tasks serially, which inevitably leads to artifact amplification, texture suppression, and structural degradation. To address this, we propose Decoupling then Perceive (DTP), a novel frequency-aware framework that explicitly separates luminance and texture into semantically independent components, enabling specialized modeling and coherent reconstruction. Specifically, we propose a Frequency-aware Structural Decoupling (FSD) mechanism that adaptively separates the input into low-frequency luminance and high-frequency texture subspaces, laying a solid foundation for targeted representation learning and reconstruction. Based on the decoupled representation, we further design a Semantics-specific Dual-path Representation (SDR) learning strategy that performs targeted enhancement and reconstruction for each frequency component, facilitating robust luminance adjustment and fine-grained texture recovery. Building upon this dual-path modeling, we introduce a Cross-frequency Semantic Recomposition (CSR) module that selectively integrates the decoupled representations to promote structural consistency and perceptual alignment in the reconstructed output. Extensive experiments on the most widely used LLISR benchmarks demonstrate the superiority of our DTP framework, which achieves a 1.6% gain in PSNR, a 9.6% gain in SSIM, and a 48% reduction in LPIPS compared to the state-of-the-art (SOTA) algorithm. Code is released at https://github.com/JXVision/DTP.
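
To make the idea of frequency-domain decoupling concrete, the following is a minimal sketch of splitting a grayscale image into a low-frequency part and a high-frequency part with a fixed radial mask on the 2-D FFT. It is not the FSD module described above, which is adaptive and learned; the function name `frequency_split` and the `cutoff` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def frequency_split(image, cutoff=0.1):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Illustrative assumption: a fixed circular low-pass mask in the Fourier
    domain. `cutoff` is a radius in cycles per pixel (0 to about 0.707).
    """
    h, w = image.shape
    # Centered 2-D spectrum of the image.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Radial frequency of every spectrum coefficient, in cycles per pixel.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Coefficients inside the circle go to the low band, the rest to the high band.
    low_mask = radius <= cutoff
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high
```

Because the two masks are complementary, `low + high` reproduces the input up to floating-point error, and the mean brightness (the DC term) ends up entirely in `low`. This loosely mirrors the luminance and texture separation in the abstract, although a fixed cutoff is only a rough stand-in for an adaptive split.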