LCUDiff: Latent Capacity Upgrade Diffusion for Faithful Human Body Restoration

📅 2026-02-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited fidelity and structural coherence of existing human body restoration methods. It proposes LCUDiff, a one-step restoration framework that extends the latent space of a pretrained latent diffusion model from 4 to 16 channels. The key components are channel splitting distillation (CSD), which keeps the first four latent channels aligned with the pretrained prior while the added channels encode high-frequency detail; prior-preserving adaptation (PPA), which bridges the mismatch between the 4-channel diffusion backbone and the 16-channel latent; and a decoder router (DeR), which selects a decoder per sample using restoration-quality score annotations. With the fine-tuned variational autoencoder (VAE), LCUDiff achieves competitive results on synthetic and real-world datasets, with higher fidelity and fewer artifacts under mild degradations, while keeping one-step inference efficiency.

📝 Abstract

Existing methods for restoring degraded human-centric images often struggle with insufficient fidelity, particularly in human body restoration (HBR). Recent diffusion-based restoration methods commonly adapt pre-trained text-to-image diffusion models, where the variational autoencoder (VAE) can significantly bottleneck restoration fidelity. We propose LCUDiff, a stable one-step framework that upgrades a pre-trained latent diffusion model from a 4-channel latent space to a 16-channel latent space. For VAE fine-tuning, channel splitting distillation (CSD) keeps the first four channels aligned with the pre-trained priors while allocating the additional channels to encode high-frequency details. We further design prior-preserving adaptation (PPA) to smoothly bridge the mismatch between 4-channel diffusion backbones and the higher-dimensional 16-channel latent. In addition, we propose a decoder router (DeR) that performs per-sample decoder routing using restoration-quality score annotations, which improves visual quality across diverse conditions. Experiments on synthetic and real-world datasets show competitive results, with higher fidelity and fewer artifacts under mild degradations, while preserving one-step efficiency. The code and model will be available at https://github.com/gobunu/LCUDiff.
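
The channel-splitting idea in the abstract can be illustrated with a toy sketch. The abstract does not give the actual CSD objective, so everything below is an assumption made for illustration: the mean-squared-error form of the alignment term, the hypothetical names `split_channels` and `alignment_loss`, and the representation of a latent as a list of channels, each a flat list of floats. The sketch only shows that an alignment term restricted to the first four channels leaves the other twelve free to carry additional detail.

```python
from typing import List, Tuple

# Toy latent: a list of channels, each a flat list of spatial values.
Latent = List[List[float]]


def split_channels(latent: Latent, n_prior: int = 4) -> Tuple[Latent, Latent]:
    """Split a latent into its first n_prior channels and the remaining ones."""
    if len(latent) < n_prior:
        raise ValueError("latent has fewer channels than n_prior")
    return latent[:n_prior], latent[n_prior:]


def alignment_loss(student: Latent, teacher: Latent, n_prior: int = 4) -> float:
    """MSE between the student's first n_prior channels and the teacher latent.

    Assumption: MSE is a stand-in; the paper's real distillation loss
    is not specified in the abstract.
    """
    if len(teacher) != n_prior:
        raise ValueError("teacher must have exactly n_prior channels")
    prior_part, _extra = split_channels(student, n_prior)
    total, count = 0.0, 0
    for s_channel, t_channel in zip(prior_part, teacher):
        if len(s_channel) != len(t_channel):
            raise ValueError("channel sizes differ between student and teacher")
        for s, t in zip(s_channel, t_channel):
            total += (s - t) ** 2
            count += 1
    return total / count


# Usage: a 4-channel teacher latent and a 16-channel student latent.
teacher = [[1.0, 2.0] for _ in range(4)]
student = [list(ch) for ch in teacher] + [[9.0, 9.0] for _ in range(12)]
loss = alignment_loss(student, teacher)
```

In this toy setup the loss depends only on the first four student channels, so changing the twelve extra channels does not alter it. In a real fine-tuning run the extra channels would presumably be driven by a separate reconstruction objective, which is omitted here.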

Problem

Research questions and friction points this paper is trying to address.

Human Body Restoration

Diffusion Models

Latent Space

Image Restoration

VAE Bottleneck

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Diffusion

Channel Splitting Distillation

Prior-Preserving Adaptation