AI Summary
Existing exemplar-guided image editing methods either rely on costly pre-trained models or adopt training-free inversion strategies that suffer from poor editing quality and low efficiency. This paper proposes ReInversion, a training-free, reversible inversion framework that achieves precise latent-space alignment between source and reference images via two-stage conditional denoising. It further introduces a mask-guided selective denoising mechanism to ensure accurate editing-region localization while preserving background structural integrity. To the best of our knowledge, ReInversion is the first method enabling high-quality, efficient exemplar-guided editing entirely without training or fine-tuning. It achieves state-of-the-art editing fidelity across multiple benchmarks while significantly reducing computational overhead and maintaining high reconstruction fidelity.
Abstract
Exemplar-guided Image Editing (EIE) aims to modify a source image according to a visual reference. Existing approaches often require large-scale pre-training to learn relationships between the source and reference images, incurring high computational costs. As a training-free alternative, inversion techniques can be used to map the source image into a latent space for manipulation. However, our empirical study reveals that standard inversion is sub-optimal for EIE, leading to poor quality and inefficiency. To tackle this challenge, we introduce Reversible Inversion (ReInversion) for effective and efficient EIE. Specifically, ReInversion operates as a two-stage denoising process, which is first conditioned on the source image and subsequently on the reference. In addition, we introduce a Mask-Guided Selective Denoising (MSD) strategy to constrain edits to target regions, preserving the structural consistency of the background. Both qualitative and quantitative comparisons demonstrate that our ReInversion method achieves state-of-the-art EIE performance with the lowest computational overhead.
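The two-stage structure described above, source-conditioned denoising followed by reference-conditioned denoising with mask-guided blending, can be sketched numerically. This is a minimal illustration, not the paper's implementation: `fake_denoise` is a toy stand-in for one conditional denoising step of a diffusion sampler, latents are flat lists of floats, and the per-step mask blend is an assumed reading of how MSD confines updates to the editing region.

```python
def fake_denoise(x, cond):
    # Toy stand-in for one conditional denoising step:
    # nudge each latent value toward the conditioning signal.
    return [xi + 0.5 * (ci - xi) for xi, ci in zip(x, cond)]

def reinversion_edit(source, reference, mask, steps=10):
    # Stage 1: denoise conditioned on the source image,
    # aligning the latent with the source content.
    x = list(source)
    for _ in range(steps):
        x = fake_denoise(x, source)
    # Stage 2: denoise conditioned on the reference, but keep the
    # reference-driven update only inside the mask (mask-guided
    # selective denoising), so the background retains the source.
    for _ in range(steps):
        x_ref = fake_denoise(x, reference)
        x = [m * r + (1 - m) * xi for m, r, xi in zip(mask, x_ref, x)]
    return x

# Toy usage: a 4-value "latent"; edit only the first two entries.
src = [0.0, 0.0, 0.0, 0.0]
ref = [1.0, 1.0, 1.0, 1.0]
msk = [1.0, 1.0, 0.0, 0.0]
out = reinversion_edit(src, ref, msk)
```

With this toy denoiser, the masked entries converge toward the reference while the unmasked background stays exactly at the source values, mirroring the intended separation between editing region and preserved background.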