Latent Harmony: Synergistic Unified UHD Image Restoration via Latent Space Regularization and Controllable Refinement

📅 2025-10-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
UHD image restoration faces an inherent trade-off between computational efficiency and faithful high-frequency detail preservation. To address this, we propose Latent Harmony, a two-stage latent-space framework. In the first stage, we design the LH-VAE encoder, which jointly incorporates visual semantic constraints and degradation-aware perturbations to enhance discriminability of latent representations. In the second stage, we introduce HF-LoRA, a lightweight adapter module for high-frequency residual modeling, and a perception-driven LoRA-tuned decoder; fidelity-perception balance is explicitly controlled via an α-parameterized high-frequency alignment loss and selective gradient propagation. Evaluated on both UHD and standard-resolution benchmarks, Latent Harmony achieves state-of-the-art performance with significantly reduced computational overhead, yielding reconstructions that simultaneously exhibit structural accuracy and textural realism.

📝 Abstract
Ultra-High Definition (UHD) image restoration faces a trade-off between computational efficiency and high-frequency detail retention. While Variational Autoencoders (VAEs) improve efficiency via latent-space processing, their Gaussian constraint often discards degradation-specific high-frequency information, hurting reconstruction fidelity. To overcome this, we propose Latent Harmony, a two-stage framework that redefines VAEs for UHD restoration by jointly regularizing the latent space and enforcing high-frequency-aware reconstruction. In Stage One, we introduce LH-VAE, which enhances semantic robustness through visual semantic constraints and progressive degradation perturbations, while latent equivariance strengthens high-frequency reconstruction. Stage Two jointly trains this refined VAE with a restoration model using High-Frequency Low-Rank Adaptation (HF-LoRA): an encoder LoRA guided by a fidelity-oriented high-frequency alignment loss to recover authentic details, and a decoder LoRA driven by a perception-oriented loss to synthesize realistic textures. Both LoRA modules are trained via alternating optimization with selective gradient propagation to preserve the pretrained latent structure. At inference, a tunable parameter α enables flexible fidelity-perception trade-offs. Experiments show Latent Harmony achieves state-of-the-art performance across UHD and standard-resolution tasks, effectively balancing efficiency, perceptual quality, and reconstruction accuracy.
Problem

Research questions and friction points this paper is trying to address.

Balancing computational efficiency with high-frequency detail preservation
Overcoming Gaussian constraints in VAEs for better reconstruction fidelity
Enabling tunable trade-offs between fidelity and perceptual quality
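
As a rough illustration of what "high-frequency detail" can mean in a loss, the sketch below extracts a high-frequency residual as the image minus a 3x3 box blur and compares two images by the mean absolute difference of those residuals. The choice of filter, the L1 distance, and the names `box_blur`, `high_freq`, and `hf_alignment_loss` are assumptions made for illustration; the paper's actual high-frequency alignment loss is not specified in this summary.

```python
def box_blur(img):
    """3x3 mean filter with edge replication; img is a list of rows of floats."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def high_freq(img):
    """High-frequency residual: the image minus its low-pass (blurred) version."""
    low = box_blur(img)
    return [[p - l for p, l in zip(pr, lr)] for pr, lr in zip(img, low)]


def hf_alignment_loss(pred, target):
    """Mean absolute difference between the two high-frequency residuals."""
    hp, ht = high_freq(pred), high_freq(target)
    n = len(pred) * len(pred[0])
    return sum(abs(a - b) for ra, rb in zip(hp, ht) for a, b in zip(ra, rb)) / n


# Toy 4x4 inputs: two flat images at different brightness and a checkerboard.
flat = [[0.5] * 4 for _ in range(4)]
brighter = [[0.75] * 4 for _ in range(4)]
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
```

In this toy setup a uniform brightness shift (`flat` versus `brighter`) incurs essentially no penalty, while a missing texture (`flat` versus `checker`) does, which is the property a detail-focused loss is meant to have.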
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent space regularization enhances semantic robustness
High-Frequency LoRA modules recover authentic details
Tunable parameter enables flexible fidelity-perception trade-offs
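
The abstract mentions alternating optimization with selective gradient propagation to preserve the pretrained latent structure. The toy below sketches that general pattern with scalars: two frozen "pretrained" weights, two trainable adapter offsets, and updates that alternate between an encoder-side and a decoder-side objective, with the latent treated as a constant during the decoder phase. Every value, loss, and name here is hypothetical; it is a sketch of the idea under these assumptions, not the paper's training procedure.

```python
def train(steps=200, lr=0.05):
    # Frozen "pretrained" weights (hypothetical values); never updated below.
    w_enc, w_dec = 0.8, 1.2
    # Trainable adapter offsets standing in for the encoder and decoder LoRA.
    a_enc, a_dec = 0.0, 0.0
    x, z_target, y_target = 1.0, 1.0, 2.0
    history = []
    for step in range(steps):
        z = (w_enc + a_enc) * x
        if step % 2 == 0:
            # Encoder phase: fidelity-style loss on the latent, update a_enc only.
            loss = (z - z_target) ** 2
            a_enc -= lr * 2.0 * (z - z_target) * x
        else:
            # Decoder phase: z is treated as a constant (no gradient reaches
            # the encoder), so only a_dec is updated.
            y = (w_dec + a_dec) * z
            loss = (y - y_target) ** 2
            a_dec -= lr * 2.0 * (y - y_target) * z
        history.append(loss)
    return (w_enc, w_dec), (a_enc, a_dec), history


base, adapters, history = train()
```

Because each phase touches only its own adapter and the base weights are never written, the pretrained mapping stays intact while both objectives are driven down in turn.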
Authors

Yidi Liu, University of Science and Technology of China (computer vision)
Xueyang Fu, University of Science and Technology of China
Jie Huang, University of Science and Technology of China
Jie Xiao, University of Science and Technology of China (low level vision, generative model, machine learning)
Dong Li, University of Science and Technology of China
Wenlong Zhang, Shanghai AI Laboratory
Lei Bai, Shanghai AI Laboratory (Foundation Model, Science Intelligence, Multi-Agent System, Autonomous Discovery)
Zheng-jun Zha, University of Science and Technology of China