OP4KSR: One-Step Patch-Free 4K Super-Resolution with Periodic Artifact Suppression

📅 2026-05-13
🤖 AI Summary
Existing diffusion-based 4K super-resolution models are constrained by memory and therefore rely on patch-based inference, which sacrifices global semantic coherence, introduces spatial inconsistencies, and incurs high latency. This work proposes OP4KSR, a one-step, patch-free 4K super-resolution method that combines a Flux backbone with an F16 variational autoencoder to generate a full 4096×4096 image in a single forward pass, taking 5.75 seconds on a single NVIDIA H20 GPU. To suppress the periodic artifacts that this one-step adaptation triggers, the approach couples Rotary Position Embedding (RoPE) base frequency rescaling (RFR) with an autocorrelation-based periodicity loss (ℒ_AP). The method achieves competitive perceptual quality with efficient inference, and is accompanied by a dedicated training dataset and three new evaluation benchmarks (one synthetic, two real-world).
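To make the memory argument concrete, here is a rough back-of-the-envelope sketch of how the VAE downsampling factor changes the latent grid for a 4096×4096 output. The helper name `latent_positions` is hypothetical, the 8× factor is included only as an assumed point of comparison, and any additional patchify step in the backbone is ignored.

```python
def latent_positions(image_side: int, downsample: int) -> int:
    """Count spatial latent positions for a square image (hypothetical helper)."""
    if image_side % downsample != 0:
        raise ValueError("image side must be divisible by the downsampling factor")
    side = image_side // downsample
    return side * side


# F16 VAE: 16x spatial downsampling, as named in the summary.
f16_positions = latent_positions(4096, 16)  # 256 * 256 = 65536

# Assumed comparison point: an 8x VAE on the same image.
f8_positions = latent_positions(4096, 8)    # 512 * 512 = 262144

# Ratio of sequence lengths, and of pairwise attention interactions,
# which grow quadratically with the number of positions.
position_ratio = f8_positions // f16_positions  # 4
attention_ratio = position_ratio ** 2           # 16
```

Under these assumptions, halving the latent side length cuts the number of positions by 4× and the pairwise attention work by roughly 16×, which is the kind of saving that would make whole-image inference plausible on a single GPU.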
📝 Abstract
Diffusion-based real-world image super-resolution (Real-ISR) has achieved remarkable perceptual quality; however, directly super-resolving images to 4K remains limited by extreme memory consumption. Consequently, prior methods adopt patch-based inference, sacrificing global context and introducing semantic confusion, spatial inconsistency, and severe latency. We propose OP4KSR, a one-step patch-free 4K SR approach built upon the powerful Flux backbone. By leveraging the extreme-compression F16 VAE, OP4KSR makes 4K SR inference tractable under practical GPU budgets, preserving global spatial-semantic coherence while enabling highly efficient inference. However, adapting this one-step architecture intrinsically triggers severe periodic artifacts. We trace this to a RoPE base frequency allocation mismatch and intra-token spatial ambiguity, both exacerbated by the lack of iterative refinement. To suppress these artifacts, we couple RoPE base frequency rescaling (RFR) with an autocorrelation-based periodicity loss ($\mathcal{L}_\text{AP}$). Furthermore, we curate a dedicated training dataset alongside three benchmarks (one synthetic and two real-world) to advance 4K SR research. Extensive experiments demonstrate that OP4KSR achieves competitive perceptual quality with efficient inference, generating a $4096\times4096$ output in only 5.75 seconds on a single NVIDIA H20 GPU.
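The abstract names an autocorrelation-based periodicity loss but does not give its formula. As a generic illustration only, and not the authors' formulation, the sketch below scores how periodic a 2D signal is by looking for strong off-center peaks in its normalized circular autocorrelation, computed via the FFT. The function names, the `exclude` window, and the test signals are all assumptions made for this example.

```python
import numpy as np


def autocorrelation(img: np.ndarray) -> np.ndarray:
    """Normalized circular autocorrelation via the Wiener-Khinchin relation."""
    x = img - img.mean()
    power = np.abs(np.fft.fft2(x)) ** 2
    ac = np.fft.ifft2(power).real
    return ac / (ac[0, 0] + 1e-12)  # zero-lag value becomes 1


def periodicity_score(img: np.ndarray, exclude: int = 2) -> float:
    """Largest autocorrelation value outside a small window around zero lag.

    `exclude` is an assumed hyperparameter: lags within this many pixels
    of zero in both axes are ignored, since they are trivially correlated.
    """
    ac = autocorrelation(img)
    h, w = ac.shape
    # Circular distance of each lag from zero, per axis.
    dy = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    dx = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    mask = (dy > exclude) | (dx > exclude)
    return float(ac[mask].max())


# Synthetic check: a stripe pattern with period 16 versus Gaussian noise.
yy, xx = np.mgrid[0:64, 0:64]
stripes = np.sin(2 * np.pi * xx / 16)
noise = np.random.default_rng(0).standard_normal((64, 64))

stripe_score = periodicity_score(stripes)
noise_score = periodicity_score(noise)
```

A repeating pattern produces a score close to 1, while unstructured noise stays near 0. A training loss in this spirit would need a differentiable version of the same idea applied to model outputs; how OP4KSR actually defines $\mathcal{L}_\text{AP}$ is specified in the paper, not here.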
Problem

Research questions and friction points this paper is trying to address.

4K super-resolution
periodic artifacts
patch-free inference
real-world image super-resolution
memory consumption
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step super-resolution
Patch-free inference
Periodic artifact suppression
RoPE frequency rescaling
4K image reconstruction
Chengyan Deng
School of Automation Engineering, University of Electronic Science and Technology of China
Pengbin Yu
OPPO AI Center, OPPO Inc.
Zhentao Chen
OPPO AI Center, OPPO Inc.
Wei Shen
OPPO AI Center, OPPO Inc.
Kai Zhang
Associate Professor, Nanjing University (Suzhou)
Image Restoration, Inverse Problems, Computational Imaging, Computer Vision, Low-Level Vision
Meng Li
OPPO AI Center, OPPO Inc.
Lunxi Yuan
OPPO AI Center, OPPO Inc.
Xue Zhou
School of Automation Engineering, University of Electronic Science and Technology of China
Li Yu
School of Automation Engineering, University of Electronic Science and Technology of China