🤖 AI Summary
Existing super-resolution (SR) methods neglect the recompression step that restored images usually undergo for storage or transmission, so the downstream codec can introduce additional artifacts into the reconstructed images. To address this, we propose VRPSR, a recompression-aware perceptual SR framework: we formulate image compression as a conditional text-to-image generation task and use a pre-trained diffusion model to build a differentiable, generalizable codec simulator. We further design a perception-oriented end-to-end training strategy that optimizes the simulator with perceptual targets and uses slightly compressed images as the training target. The method works with multiple standardized codecs (H.264, H.265, and H.266) and also supports joint optimization of the SR model and a post-processing model applied after recompression. Experiments show that, while preserving perceptual quality, the approach saves more than 10% bitrate when applied to Real-ESRGAN and S3Diff. This narrows the gap between SR reconstruction and practical deployment constraints.
📝 Abstract
Perceptual image super-resolution (SR) methods restore degraded images and produce sharp outputs. In practice, those outputs are usually recompressed for storage and transmission. Ignoring recompression is suboptimal, as the downstream codec may introduce additional artifacts into the restored images. However, jointly optimizing SR and recompression is challenging, as codecs are not differentiable and vary in configuration. In this paper, we present Versatile Recompression-Aware Perceptual Super-Resolution (VRPSR), which makes existing perceptual SR methods aware of versatile compression. First, we formulate compression as conditional text-to-image generation and use a pre-trained diffusion model to build a generalizable codec simulator. Next, we propose a set of training techniques tailored to perceptual SR, including optimizing the simulator with perceptual targets and adopting slightly compressed images as the training target. Empirically, VRPSR saves more than 10% bitrate when built on Real-ESRGAN and S3Diff under H.264/H.265/H.266 compression. In addition, VRPSR facilitates joint optimization of the SR model and a post-processing model applied after recompression.
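The abstract's point that codecs are not differentiable can be made concrete with a toy example. The sketch below is only an illustration and is not the diffusion-based simulator described above: it assumes a single scalar value and a uniform rounding step as a stand-in for the quantization found in lossy codecs, and compares it with a smooth sine-based surrogate. All names (`hard_quantize`, `soft_quantize`, `numerical_grad`, `STEP`, `alpha`) are hypothetical and chosen for this illustration.

```python
import math


def hard_quantize(x: float, step: float) -> float:
    """Uniform rounding to a grid, a stand-in for codec quantization (assumption)."""
    return round(x / step) * step


def soft_quantize(x: float, step: float, alpha: float = 0.9) -> float:
    """Smooth staircase approximation (hypothetical surrogate).

    Its derivative is 1 - alpha * cos(2*pi*x/step), which stays positive
    for alpha < 1, so a gradient-based optimizer always receives a signal.
    """
    return x - alpha * step / (2 * math.pi) * math.sin(2 * math.pi * x / step)


def numerical_grad(f, x: float, eps: float = 1e-4) -> float:
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


STEP = 1.0  # quantization step size (arbitrary choice for the demo)
x0 = 0.3    # an input value that lies strictly between two grid points

# The hard quantizer is piecewise constant, so its gradient is zero here.
grad_hard = numerical_grad(lambda x: hard_quantize(x, STEP), x0)

# The smooth surrogate has a usable, nonzero gradient at the same point.
grad_soft = numerical_grad(lambda x: soft_quantize(x, STEP), x0)

print("gradient through hard quantizer:", grad_hard)
print("gradient through soft surrogate:", grad_soft)
```

Because the hard quantizer passes no gradient back, an SR model placed upstream of it would get no training signal about what the codec does to its output. A differentiable simulator fills that role, whatever its actual form.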