🤖 AI Summary
To address imprecise fine-grained attribute control (e.g., color, material), structural distortion, and poor global consistency in text-to-image diffusion models, this paper proposes SPAA (Structure-Preserving and Attribute Amplification), a training-free method for object-level attribute editing. SPAA introduces a joint editing mechanism that simultaneously manipulates self-attention maps and cross-attention values, enabling precise modification of an object's color and material while preserving its structure. The authors also construct the first comprehensive Attribute Dataset, covering nearly all color and material combinations for various objects, using an MLLM-powered automated pipeline for data filtering and instruction labeling, and train InstructAttribute, an instruction-based editing model, on this dataset. The approach is validated across mainstream T2I diffusion models. Experiments demonstrate that it significantly outperforms existing instruction-based editing methods while preserving object structural integrity and global image coherence, achieving state-of-the-art performance on fine-grained attribute editing tasks.
📝 Abstract
Text-to-image (T2I) diffusion models, renowned for their advanced generative abilities, are widely used in image editing applications with remarkable effectiveness. However, precise control over fine-grained attributes remains challenging. Existing image editing techniques either fail to modify an object's attributes or struggle to preserve its structure and maintain consistency in the rest of the image. To address these challenges, we propose Structure-Preserving and Attribute Amplification (SPAA), a training-free method that enables precise control over the color and material transformations of objects by editing the self-attention maps and cross-attention values. Furthermore, we construct the Attribute Dataset, which encompasses nearly all colors and materials associated with various objects, by integrating multimodal large language models (MLLMs) into an automated pipeline for data filtering and instruction labeling. Using this dataset, we train InstructAttribute, an instruction-based model for fine-grained editing of color and material attributes. Extensive experiments demonstrate that our method achieves superior performance in object-level color and material editing, outperforming existing instruction-based image editing approaches.
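The core mechanism is easiest to picture as a hook inside a diffusion UNet attention layer. Below is a minimal, illustrative sketch (not the authors' released code) of the two interventions the abstract names, assuming a standard multi-head attention layout; the names `source_self_attn`, `attr_token_idx`, and `amp` are placeholders, not identifiers from the paper.

```python
import torch


def edited_attention(q, k, v, *, source_self_attn=None, attr_token_idx=None, amp=1.5):
    """One attention layer's forward pass with SPAA-style interventions (sketch).

    q, k, v: (batch, heads, tokens, dim) projections from the layer.
    source_self_attn: self-attention map cached from the source image's
        denoising pass; reusing it is one plausible way to preserve the
        object's spatial structure (assumed here, exact schedule may differ).
    attr_token_idx: index of the attribute word (e.g. "golden") in the text
        embedding, whose value vector is amplified in cross-attention.
    """
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)

    if source_self_attn is not None:
        # Structure preservation: inject the source pass's self-attention map
        # so the edited image keeps the original object layout.
        attn = source_self_attn

    if attr_token_idx is not None:
        # Attribute amplification: scale the target attribute token's value
        # vector so the new color/material is expressed more strongly.
        v = v.clone()
        v[..., attr_token_idx, :] = v[..., attr_token_idx, :] * amp

    return attn @ v
```

In a diffusers-style pipeline, a function like this would be wired in as a custom attention processor on each UNet block during the editing pass, with the self-attention maps recorded during a reconstruction pass of the source image.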