LayerEdit: Disentangled Multi-Object Editing via Conflict-Aware Multi-Layer Learning

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Text-driven multi-object image editing suffers from attention entanglement among objects, which causes edits to leak between objects or to be overly constrained within an object, especially in overlapping regions where disentangled manipulation remains challenging. To address this, we propose a training-free, multi-layer disentangled editing framework. Its core innovations are conflict-aware layer decomposition and consistency-preserving fusion: conflicts are localized via attention-guided IoU estimation; then time-dependent region removal, intra-layer text guidance, cross-layer geometric mapping, and alpha-weighted fusion jointly enforce semantic isolation and collaborative optimization at the object level. Experiments demonstrate substantial improvements in intra-object editing controllability and inter-object spatial consistency in complex scenes. Our method outperforms existing state-of-the-art approaches in both quantitative and qualitative evaluations.

📝 Abstract

Text-driven multi-object image editing, which aims to precisely modify multiple objects within an image based on text descriptions, has recently attracted considerable interest. Existing works primarily follow the localize-editing paradigm, focusing on independent object localization and editing while neglecting critical inter-object interactions. However, this work points out that the neglected attention entanglements in inter-object conflict regions inherently hinder disentangled multi-object editing, leading to either inter-object editing leakage or intra-object editing constraints. We therefore propose LayerEdit, a novel training-free multi-layer disentangled editing framework that, for the first time, enables conflict-free object-layered editing through precise object-layered decomposition and coherent fusion. Specifically, LayerEdit introduces a novel "decompose-editing-fusion" framework consisting of: (1) a Conflict-aware Layer Decomposition module, which uses an attention-aware IoU scheme and time-dependent region removal to enhance conflict awareness and suppression during layer decomposition; (2) an Object-layered Editing module, which establishes coordinated intra-layer text guidance and cross-layer geometric mapping to achieve disentangled semantic and structural modifications; and (3) a Transparency-guided Layer Fusion module, which facilitates structure-coherent inter-object layer fusion through precise transparency guidance learning. Extensive experiments verify the superiority of LayerEdit over existing methods, showing unprecedented intra-object controllability and inter-object coherence in complex multi-object scenarios. Code is available at: https://github.com/fufy1024/LayerEdit.
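To make the idea of locating conflicts with an attention-based IoU more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not specify how attention maps are binarized or how the IoU is used, so the fixed threshold, the function names (`binarize`, `attention_iou`), and the toy maps are all assumptions made for illustration.

```python
# Illustrative sketch only: the threshold and helper names are assumptions,
# not part of LayerEdit's published code.

def binarize(attn, threshold=0.5):
    """Turn a 2D attention map (values in [0, 1]) into a boolean mask."""
    return [[value >= threshold for value in row] for row in attn]


def attention_iou(attn_a, attn_b, threshold=0.5):
    """Return (IoU, conflict mask) for two objects' attention maps.

    The conflict mask marks pixels that both objects attend to, i.e. the
    overlap where edits to one object could leak into the other.
    """
    mask_a = binarize(attn_a, threshold)
    mask_b = binarize(attn_b, threshold)
    intersection = 0
    union = 0
    conflict = []
    for row_a, row_b in zip(mask_a, mask_b):
        conflict_row = []
        for a, b in zip(row_a, row_b):
            both = a and b
            intersection += both
            union += a or b
            conflict_row.append(both)
        conflict.append(conflict_row)
    iou = intersection / union if union else 0.0
    return iou, conflict


# Toy attention maps for two prompt tokens (hypothetical values).
cat_attn = [[0.9, 0.8, 0.1],
            [0.7, 0.6, 0.0]]
hat_attn = [[0.1, 0.9, 0.2],
            [0.0, 0.7, 0.1]]

iou, conflict = attention_iou(cat_attn, hat_attn)
print(iou)       # share of the combined region claimed by both objects
print(conflict)  # True where the two objects overlap
```

A high IoU between two objects would flag them as entangled and point to the overlapping pixels that need special handling during decomposition. How LayerEdit actually acts on that signal is described in the paper rather than here.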

Problem

Research questions and friction points this paper is trying to address.

Addresses attention entanglement in multi-object image editing

Prevents inter-object editing leakage and intra-object editing constraints

Enables conflict-free object-layered editing through decomposition and fusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-layer disentangled framework for conflict-free editing

Conflict-aware layer decomposition with attention IoU

Transparency-guided fusion for coherent multi-object integration
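As a rough illustration of what alpha-weighted layer fusion involves, the sketch below composites edited object layers onto a background with the standard "over" operator. It is only an assumption about the general mechanism. LayerEdit learns its transparency guidance, whereas this example takes fixed alpha maps as input, and the function name `fuse_layers` and the grayscale toy data are hypothetical.

```python
# Illustrative sketch only: standard "over" compositing with given alpha maps,
# standing in for the learned transparency guidance described in the paper.

def fuse_layers(background, layers):
    """Composite layers onto a background, back to front.

    background: 2D list of grayscale pixel values.
    layers: list of (pixels, alpha) pairs ordered back to front, where
            alpha holds per-pixel opacity in [0, 1].
    """
    fused = [row[:] for row in background]  # copy so the input is untouched
    for pixels, alpha in layers:
        for y, row in enumerate(pixels):
            for x, value in enumerate(row):
                a = alpha[y][x]
                fused[y][x] = a * value + (1.0 - a) * fused[y][x]
    return fused


# Toy 1x2 image: two object layers that overlap in the second pixel.
background = [[0.0, 0.0]]
layer_back = ([[1.0, 1.0]], [[1.0, 0.5]])
layer_front = ([[0.0, 0.25]], [[0.0, 0.5]])

print(fuse_layers(background, [layer_back, layer_front]))
```

In the overlapping pixel, the front layer contributes half of the result and the already-fused content behind it contributes the other half, which is how per-pixel transparency can keep occlusion order coherent across objects.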

🔎 Similar Papers

Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit