Align & Invert: Solving Inverse Problems with Diffusion and Flow-based Models via Representational Alignment

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Balancing reconstruction quality and inference efficiency remains challenging in image inverse problems. To address this, we apply representation alignment (REPA), a plug-and-play regularization that aligns the latent representations of diffusion or flow-based generative models with semantic features extracted by a pre-trained self-supervised encoder (e.g., DINOv2) during inference—requiring neither paired ground-truth data nor additional model training. By imposing explicit embedding-space regularization, REPA steers the generative prior toward semantically consistent representations of clean images, revealing an intrinsic link between this regularization mechanism and improved perceptual quality. Extensive experiments on super-resolution, inpainting, and Gaussian/motion deblurring demonstrate that REPA consistently improves both PSNR and LPIPS while reducing the number of sampling steps by over 50%. Moreover, it integrates seamlessly with mainstream inverse problem solvers, achieving a unified trade-off between high-fidelity reconstruction and computational efficiency.

📝 Abstract
Enforcing alignment between the internal representations of diffusion or flow-based generative models and those of pretrained self-supervised encoders has recently been shown to provide a powerful inductive bias, improving both convergence and sample quality. In this work, we extend this idea to inverse problems, where pretrained generative models are employed as priors. We propose applying representation alignment (REPA) between diffusion or flow-based models and a pretrained self-supervised visual encoder, such as DINOv2, to guide the reconstruction process at inference time. Although ground-truth signals are unavailable in inverse problems, we show that aligning model representations with approximate target features can substantially enhance reconstruction fidelity and perceptual realism. We provide theoretical results showing (a) the relation between the REPA regularization and a divergence measure in the DINOv2 embedding space, and (b) how REPA updates steer the model's internal representations toward those of the clean image. These results offer insights into the role of REPA in improving perceptual fidelity. Finally, we demonstrate the generality of our approach by integrating it into multiple state-of-the-art inverse problem solvers. Extensive experiments on super-resolution, box inpainting, Gaussian deblurring, and motion deblurring confirm that our method consistently improves reconstruction quality across tasks, while also providing substantial efficiency gains by reducing the number of required discretization steps without compromising the performance of the underlying solver.
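The alignment described in the abstract can be pictured as an inference-time regularizer: project the generative model's intermediate features into the encoder's embedding space and penalize low patch-wise cosine similarity with the (approximate) target features. The sketch below, in plain NumPy, is illustrative only — the function names, the guided-update form, and the weight `lam` are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def cosine_alignment_loss(h, y):
    """REPA-style alignment term: negative mean patch-wise cosine
    similarity between projected model features h and self-supervised
    encoder features y, both of shape [num_patches, dim]."""
    h_n = h / np.linalg.norm(h, axis=1, keepdims=True)
    y_n = y / np.linalg.norm(y, axis=1, keepdims=True)
    return -np.mean(np.sum(h_n * y_n, axis=1))

def repa_guided_step(x, solver_update, align_grad, step=0.01, lam=0.5):
    """One hypothetical sampling step: the base solver's update on x,
    plus a correction that descends the alignment loss (align_grad is
    the gradient of cosine_alignment_loss w.r.t. x)."""
    return x + step * (solver_update - lam * align_grad)
```

Perfectly aligned features give a loss of -1 (maximal similarity), orthogonal features give 0, so descending this loss pulls the model's internal representations toward the encoder's features of the clean image, which is the mechanism the theoretical results characterize.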
Problem

Research questions and friction points this paper is trying to address.

Improving inverse problem solving using diffusion and flow-based models
Enhancing reconstruction fidelity through representation alignment with encoders
Applying alignment regularization to super-resolution and deblurring tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns diffusion models with self-supervised encoders
Uses representation alignment to guide reconstruction
Improves fidelity and efficiency in inverse problems