FREE-Edit: Using Editing-aware Injection in Rectified Flow Models for Zero-shot Image-Driven Video Editing

πŸ“… 2026-03-01
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenge of propagating image-based edits from the first frame to subsequent frames in video editing, where improper attention injection often leads to semantic inconsistencies or loss of the original video's characteristics. To this end, the authors propose an editing-aware attention injection mechanism, termed REE, which dynamically modulates token-wise injection strength using the initial edit mask and optical flow–guided temporal propagation. Within edited regions, the method suppresses source-feature injection to preserve edit fidelity while maintaining motion and structural coherence across frames. Built upon rectified flow, the resulting framework, FREE-Edit, enables zero-shot video editing without any training or fine-tuning. Extensive experiments demonstrate that FREE-Edit consistently produces high-quality results across diverse image-driven video editing tasks, significantly outperforming existing approaches.

πŸ“ Abstract
Image-driven video editing aims to propagate edited content from the modified first frame to the remaining frames. Existing methods usually invert the source video to noise using a pre-trained image-to-video (I2V) model and then guide the sampling process with the edited first frame. A popular choice for maintaining the motion and layout of the source video is to intervene in the denoising process by injecting attention during reconstruction. However, such injection often yields unsatisfactory results: excessive injection introduces conflicting semantics from the source video, while insufficient injection preserves too little of the source representation. Recognizing this, we propose an Editing-awaRE (REE) injection method that modulates the injection intensity of each token. Specifically, we first compute the pixel difference between the source and edited first frames to form a corresponding editing mask. Next, we track the edited area throughout the entire video by using optical flow to warp the first-frame mask. Then, an editing-aware feature-injection intensity is generated for each token, with no injection performed in edited areas. Building upon REE injection, we further propose a zero-shot image-driven video editing framework based on recently emerging rectified-flow models, dubbed FREE-Edit. Without fine-tuning or training, FREE-Edit demonstrates effectiveness in various image-driven video editing scenarios, producing higher-quality outputs than existing techniques. Project page: https://free-edit.github.io/page/.
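The three steps in the abstract (pixel-difference mask from the first frames, optical-flow warping of that mask through the video, and token-wise injection weights that zero out injection in edited areas) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the fixed threshold, the nearest-neighbour backward warping, and the patch-averaged token weighting are all assumptions.

```python
import numpy as np

def edit_mask(src_frame, edited_frame, thresh=0.05):
    """Binary editing mask from the per-pixel difference of the source
    and edited first frames; frames are floats in [0, 1], shape (H, W, C)."""
    diff = np.abs(src_frame - edited_frame).mean(axis=-1)
    return (diff > thresh).astype(np.float32)

def warp_mask(mask, flow):
    """Propagate the mask to the next frame via backward warping: each
    target pixel looks up the mask at (x - u, y - v) with nearest-neighbour
    sampling (a simplification of a real flow-warping operator)."""
    H, W = mask.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, W - 1)
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, H - 1)
    return mask[src_y, src_x]

def injection_weights(mask, patch=16):
    """Token-wise injection intensity: average the mask over non-overlapping
    patches (one patch ~ one token) and suppress source-feature injection
    where the patch is edited. 1.0 = full injection, 0.0 = none."""
    H, W = mask.shape
    m = mask[: H // patch * patch, : W // patch * patch]
    m = m.reshape(H // patch, patch, W // patch, patch).mean(axis=(1, 3))
    return (1.0 - m).reshape(-1)
```

In a real pipeline the flow would come from an optical-flow estimator and the weights would scale the attention features injected during the reconstruction branch; here zero-flow warping simply carries the first-frame mask forward unchanged.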
Problem

Research questions and friction points this paper is trying to address.

image-driven video editing
feature injection
motion preservation
layout consistency
zero-shot editing
Innovation

Methods, ideas, or system contributions that make the work stand out.

editing-aware injection
rectified flow
zero-shot video editing
optical flow
feature injection modulation
πŸ”Ž Similar Papers
No similar papers found.