🤖 AI Summary
Existing text-guided image editing methods struggle to transfer complex textures—such as clouds, flames, and marble—due to severe entanglement between texture and semantic content representations, coupled with reliance on fine-tuning. To address this, we propose a fine-tuning-free diffusion-based framework. Our approach introduces a novel "target prompt nullification" strategy to explicitly decouple texture from content. We further design an editing-localization fusion mechanism: structural consistency is preserved via query-feature reweighting in self-attention together with residual-block feature constraints, while background integrity is ensured by jointly leveraging latent variables and self-attention responses. Extensive experiments on cloud, flame, and marble texture transfer demonstrate substantial improvements in shape preservation and background fidelity, surpassing state-of-the-art methods in both qualitative and quantitative evaluations. Code is publicly available.
📝 Abstract
Recently, text-guided image editing has achieved significant success. However, when changing the texture of an object, existing methods can only apply simple textures such as wood or gold; complex textures such as cloud or fire remain a challenge. This limitation stems from the fact that the target prompt must describe both the input image content and the target texture, restricting the texture representation. In this paper, we propose TextureDiffusion, a tuning-free image editing method applicable to various texture transfer tasks. Initially, the target prompt is set directly to the empty string "", disentangling the texture from the input image content to enhance the texture representation. Subsequently, query features in self-attention and features in residual blocks are utilized to preserve the structure of the input image. Finally, to maintain the background, we introduce an edit-localization technique that blends the self-attention results with the intermediate latents. Comprehensive experiments demonstrate that TextureDiffusion can harmoniously transfer various textures with excellent structure and background preservation. Code is publicly available at https://github.com/THU-CVML/TextureDiffusion.
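The edit-localization step described above can be sketched as a masked blend of intermediate latents, where the mask comes from self-attention responses. The function name, the mask derivation (thresholding an aggregated attention map), and the threshold value are illustrative assumptions for this sketch, not the authors' actual implementation:

```python
import numpy as np

def edit_localization_blend(src_latent, edit_latent, attn_response, threshold=0.35):
    """Blend edited and source latents so the edit stays inside the object region.

    src_latent, edit_latent: (C, H, W) intermediate diffusion latents.
    attn_response: (H, W) self-attention response for the edited object
        (assumption: averaged over heads and normalized to [0, 1]).
    """
    # Binarize the attention response into an edit mask (threshold is a guess).
    mask = (attn_response > threshold).astype(src_latent.dtype)  # (H, W)
    # Inside the mask, keep the edited latent; outside it, restore the
    # source latent so the background is left untouched.
    return mask[None] * edit_latent + (1.0 - mask[None]) * src_latent

# Toy usage: a 4-channel 8x8 latent with an edit confined to the top-left quadrant.
rng = np.random.default_rng(0)
src = rng.standard_normal((4, 8, 8))
edit = rng.standard_normal((4, 8, 8))
attn = np.zeros((8, 8))
attn[:4, :4] = 1.0
out = edit_localization_blend(src, edit, attn)
```

In the toy example, `out` equals the edited latent inside the attended quadrant and the source latent everywhere else, which is exactly the background-preservation behavior the abstract claims.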