AI Summary
Existing image editing methods rely on costly training or on diffusion inversion to incorporate visual context, which limits both the consistency and flexibility of edits. This work proposes VicoEdit, the first framework to enable visual context injection without additional training or diffusion inversion. By fusing the visual context of a context image directly into a pretrained text-guided editing model and employing a concept-aligned guidance strategy during posterior sampling, VicoEdit substantially improves the consistency and controllability of editing outcomes. Experimental results demonstrate that VicoEdit outperforms state-of-the-art training-based approaches across multiple metrics, yielding higher-quality and more consistent edits.
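To make the injection idea concrete, below is a minimal, hypothetical sketch of one common training-free way to fuse visual context into a pretrained model: concatenating keys and values derived from the context image into the editor's self-attention. This is an illustrative pattern under assumed toy shapes, not VicoEdit's published mechanism; `attention_with_context` and all tensors are hypothetical.

```python
# Hypothetical sketch: injecting visual-context features into a pretrained
# model's self-attention by concatenating context keys/values. A common
# training-free pattern, not necessarily VicoEdit's exact mechanism.
import torch
import torch.nn.functional as F

def attention_with_context(q, k, v, k_ctx, v_ctx):
    """Source queries attend over both source and context keys/values."""
    k_all = torch.cat([k, k_ctx], dim=1)  # (B, N_src + N_ctx, D)
    v_all = torch.cat([v, v_ctx], dim=1)
    scores = q @ k_all.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v_all  # (B, N_src, D)

# Toy shapes: batch 1, 16 source tokens, 16 context tokens, dim 32.
q = torch.rand(1, 16, 32)
k, v = torch.rand(1, 16, 32), torch.rand(1, 16, 32)
k_ctx, v_ctx = torch.rand(1, 16, 32), torch.rand(1, 16, 32)
out = attention_with_context(q, k, v, k_ctx, v_ctx)
print(out.shape)  # torch.Size([1, 16, 32])
```

Because the context only enters through extra keys and values, the pretrained attention weights are reused unchanged, which is what makes such injection schemes training-free.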
Abstract
In image editing, it is essential to incorporate a context image that conveys the user's precise requirements, such as subject appearance or image style. Existing training-based visual context-aware editing methods incur substantial data-collection effort and training cost. On the other hand, the training-free alternatives are typically built on diffusion inversion, which struggles with consistency and flexibility. In this work, we propose VicoEdit, a training-free and inversion-free method that injects the visual context into a pretrained text-prompted editing model. More specifically, VicoEdit directly transforms the source image into the target based on the visual context, thereby eliminating the need for inversion, which can lead to deviated trajectories. Moreover, we design a posterior sampling approach guided by concept alignment to further enhance editing consistency. Empirical results demonstrate that our training-free method achieves even better editing performance than state-of-the-art training-based models.
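As an illustration of guidance-based posterior sampling in this spirit, the sketch below adds a concept-alignment gradient to a toy DDIM-style denoising update, starting directly from the source image rather than from an inverted latent. `ToyEditor`, `ToyEncoder`, the noise schedule, and the cosine loss are all assumptions made for illustration, not the paper's actual components.

```python
# Hypothetical sketch of concept-alignment-guided posterior sampling: at each
# step, the gradient of an alignment loss between the predicted clean image
# and the visual context nudges the sample. ToyEditor/ToyEncoder stand in for
# the pretrained editing model and a frozen feature encoder.
import torch
import torch.nn.functional as F

class ToyEditor(torch.nn.Module):
    """Stand-in for the pretrained text-prompted editor (noise predictor)."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(3, 3, 3, padding=1)
    def forward(self, x_t, t):
        return self.net(x_t)

class ToyEncoder(torch.nn.Module):
    """Stand-in for a frozen encoder used to measure concept alignment."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(3, 8, 4, stride=4)
    def forward(self, x):
        return self.net(x).flatten(1)

def guided_edit(source, context, steps=20, scale=0.5):
    editor, encoder = ToyEditor(), ToyEncoder().eval()
    alphas = torch.linspace(0.5, 0.999, steps + 1)  # toy schedule
    ctx_feat = encoder(context).detach()
    x = source.clone()  # start from the source image; no inversion pass
    for t in range(steps):
        x = x.detach().requires_grad_(True)
        a, a_next = alphas[t], alphas[t + 1]
        eps = editor(x, t)
        x0_hat = (x - (1 - a).sqrt() * eps) / a.sqrt()  # predicted clean image
        # Concept-alignment guidance: cosine loss between the prediction's
        # features and the visual-context features.
        loss = 1 - F.cosine_similarity(encoder(x0_hat), ctx_feat).mean()
        grad = torch.autograd.grad(loss, x)[0]
        # DDIM-style transition plus a guidance step on the sample.
        x = (a_next.sqrt() * x0_hat + (1 - a_next).sqrt() * eps
             - scale * grad).detach()
    return x

edited = guided_edit(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(edited.shape)  # torch.Size([1, 3, 64, 64])
```

Applying the guidance gradient inside the sampling loop, rather than fine-tuning the model, is what keeps this style of approach training-free.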