🤖 AI Summary
This work addresses a key limitation of existing training-free object removal methods: they indiscriminately suppress attention to target regions in self-attention layers, which often degrades background reconstruction quality. To overcome this, the authors propose AdaEraser, an adaptive attention suppression framework that removes objects progressively during the diffusion denoising process by modulating self-attention strength according to the estimated presence probability of the target concept. The method derives a timestep-aware, token-level suppression strategy from an analysis of how self-attention maps evolve across denoising timesteps, and it performs object removal without any additional training. Experimental results reported by the authors show that the approach outperforms current training-free techniques across multiple metrics and even surpasses several training-based methods, with better photorealism and structural consistency in the inpainted images.
📝 Abstract
Object removal aims to eliminate specified objects from images while plausibly inpainting the affected regions with background content. Current training-free methods typically block attention to object regions within self-attention layers during image generation, relying on surrounding background information to restore the image. However, indiscriminately suppressing self-attention in the vacated areas can degrade generation quality, because the model must simultaneously reconstruct background content in those same regions. To resolve this conflict, we propose AdaEraser, an adaptive framework that dynamically modulates attention based on the estimated presence of target object concepts. By analyzing how self-attention maps evolve across denoising timesteps before and during removal, we develop a token-wise adaptive attention suppression strategy. This strategy lets the model progressively perceive the object's removal throughout the denoising process, with the suppression strength in self-attention layers adjusted adaptively. Extensive experiments demonstrate that AdaEraser achieves superior performance in object removal, outperforming even training-based methods.
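To make the contrast between indiscriminate blocking and token-wise adaptive suppression concrete, here is a minimal sketch in plain Python. It is not the AdaEraser implementation: the abstract gives no formula, so the additive logit penalty, the `presence` scores, the `strength` parameter, and the function names are all assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def suppressed_attention(logits, object_mask, presence, strength=10.0):
    """Row-wise attention weights with a per-key suppression penalty.

    ASSUMPTION (illustrative, not from the paper): each key token j inside
    the object mask has its logit lowered by strength * presence[j], where
    presence[j] in [0, 1] is a hypothetical estimate of how much of the
    target object is still present at that token. presence[j] = 1 behaves
    like strong blocking; presence[j] = 0 leaves the token untouched so it
    can contribute to background reconstruction.
    """
    weights = []
    for query_row in logits:
        adjusted = []
        for j, logit in enumerate(query_row):
            penalty = strength * presence[j] if object_mask[j] else 0.0
            adjusted.append(logit - penalty)
        weights.append(softmax(adjusted))
    return weights


# Three tokens: token 0 is background, tokens 1 and 2 lie in the masked region.
logits = [[0.0, 0.0, 0.0] for _ in range(3)]
object_mask = [False, True, True]

# Token 1 still looks like the object; token 2 already looks like background.
presence = [0.0, 1.0, 0.0]
attn = suppressed_attention(logits, object_mask, presence)
```

In this toy setup, token 1 receives almost no attention while token 2, although inside the mask, is attended to as much as the background token. Re-estimating `presence` at every denoising timestep would be one way to obtain the progressive behavior the abstract describes, though how AdaEraser actually estimates it is not specified here.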