ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization

📅 2026-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Personalized text-to-image generation models often face a trade-off between concept fidelity and text alignment because irrelevant residual information from the reference images becomes entangled with the learned concept. To address this, the work proposes ConceptPrism, an unsupervised concept disentanglement method that needs no manual guidance such as linguistic prompts or segmentation masks. ConceptPrism automatically separates the visual concept shared across a set of reference images from image-specific residuals, and introduces an exclusion loss that drives residual tokens to discard the shared semantics, so that concept tokens represent only the core content. Within a joint optimization framework for diffusion models, the reconstruction and exclusion losses are applied together to separate concept and residual information implicitly. Experiments show that ConceptPrism mitigates concept entanglement, maintaining high fidelity while significantly improving text alignment.

📝 Abstract
Personalized text-to-image generation suffers from concept entanglement, where irrelevant residual information from reference images is captured, leading to a trade-off between concept fidelity and text alignment. Recent disentanglement approaches attempt to solve this with manual guidance, such as linguistic cues or segmentation masks, which limits their applicability and fails to fully articulate the target concept. In this paper, we propose ConceptPrism, a novel framework that automatically disentangles the shared visual concept from image-specific residuals by comparing images within a set. Our method jointly optimizes a target token and image-wise residual tokens using two complementary objectives: a reconstruction loss to ensure fidelity, and a novel exclusion loss that compels residual tokens to discard the shared concept. This process allows the target token to capture the pure concept without direct supervision. Extensive experiments demonstrate that ConceptPrism effectively resolves concept entanglement, achieving a significantly improved trade-off between fidelity and alignment.
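
The joint optimization described in the abstract can be pictured with a minimal sketch. It only illustrates the parameter layout (one shared target token, one residual token per reference image) and the combination of the two objectives. The function names, the `excl_weight` factor, the initialization scale, and the signatures of the two loss callables are all assumptions made for illustration. The listing does not give the actual form of either loss, so both are left as caller-supplied placeholders.

```python
import random


def make_tokens(num_images, dim, seed=0):
    """Create one shared target token and one residual token per image.

    Tokens are plain lists of floats here; in a real diffusion setup they
    would be learnable text-embedding vectors (assumption).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 0.02) for _ in range(dim)]
    residuals = [
        [rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_images)
    ]
    return target, residuals


def joint_objective(target, residuals, recon_loss_fn, excl_loss_fn, excl_weight=1.0):
    """Average the per-image reconstruction and exclusion terms.

    recon_loss_fn(target, residual, image_index) -> float  (hypothetical signature)
    excl_loss_fn(residual, image_index) -> float           (hypothetical signature)
    excl_weight is an assumed balancing factor, not taken from the paper.
    """
    total = 0.0
    for i, residual in enumerate(residuals):
        # Fidelity term: target + residual together should explain image i.
        total += recon_loss_fn(target, residual, i)
        # Exclusion term: the residual alone is penalized for carrying the shared concept.
        total += excl_weight * excl_loss_fn(residual, i)
    return total / len(residuals)


# Toy usage with constant stand-in losses, only to show the call shape.
target, residuals = make_tokens(num_images=3, dim=4)
loss = joint_objective(
    target,
    residuals,
    recon_loss_fn=lambda t, r, i: 2.0,
    excl_loss_fn=lambda r, i: 1.0,
    excl_weight=0.5,
)
```

In this sketch the target token is shared by every image while each residual token is tied to a single image, which is the structural reason the shared concept can settle in the target token once the residuals are discouraged from holding it.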

Problem

Research questions and friction points this paper is trying to address.

concept entanglement
personalized text-to-image generation
residual information
concept fidelity
text alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

concept disentanglement
personalized diffusion models
residual token optimization
exclusion loss
text-to-image generation