DreamRelation: Bridging Customization and Relation Generation

📅 2024-10-30
📈 Citations: 7 (1 influential)
🤖 AI Summary
Existing personalized image generation models struggle to simultaneously preserve subject identity fidelity and accurately realize spatial/semantic relationships specified in text prompts. To address this, we propose a relation-aware personalized diffusion generation framework. First, we establish an identity–relation disentangled learning paradigm. Second, we design a keypoint-matching loss to explicitly model pose-dependent relational semantics—such as “riding,” “handing,” and “occluding.” Third, we introduce a local feature fusion mechanism leveraging image prompts to mitigate ambiguity arising from overlapping objects. Evaluated on our newly constructed relation-specific benchmark, our method achieves significant improvements in relational accuracy (+18.7%) and identity fidelity (+22.3%). It robustly supports natural synthesis of multi-object scenes involving complex relational configurations. This work establishes a novel paradigm for controllable, semantically grounded image generation.

📝 Abstract
Customized image generation is essential for creating personalized content based on user prompts, allowing large-scale text-to-image diffusion models to more effectively meet individual needs. However, existing models often neglect the relationships between customized objects in generated images. This work addresses that gap by focusing on relation-aware customized image generation, which seeks to preserve the identities from image prompts while maintaining the relationship specified in text prompts. Specifically, we introduce DreamRelation, a framework that disentangles identity and relation learning using a carefully curated dataset. Our training data consists of relation-specific images, independent object images containing identity information, and text prompts to guide relation generation. We then propose two key modules to tackle two main challenges: generating accurate and natural relationships, especially when significant pose adjustments are required, and avoiding object confusion in cases of overlap. First, we introduce a keypoint matching loss that effectively guides the model in adjusting object poses closely tied to their relationships. Second, we incorporate local features of the image prompts to better distinguish between objects, preventing confusion in overlapping cases. Extensive results on our proposed benchmarks demonstrate the superiority of DreamRelation in generating precise relations while preserving object identities across a diverse set of objects and relationships.

Problem

Research questions and friction points this paper is trying to address.

Enhancing relation-aware customized image generation from text prompts
Preserving object identities while adjusting poses for relationships
Preventing object confusion in overlapping image generation scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Disentangles identity and relation learning
Uses keypoint matching loss for pose adjustment
Incorporates local features to prevent object confusion
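
The abstract names a keypoint matching loss but does not give its formula. As a rough illustration only, the sketch below computes a generic mean squared distance between predicted and reference keypoints that are already matched by index. The function name, the `visible` mask, and the choice of squared Euclidean distance are all assumptions made for this example, not the paper's definition.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y), in pixels or normalized coordinates


def keypoint_matching_loss(
    pred: Sequence[Point],
    target: Sequence[Point],
    visible: Optional[Sequence[bool]] = None,
) -> float:
    """Mean squared Euclidean distance between index-matched keypoints.

    ASSUMPTION: pred[i] corresponds to target[i]. The loss used in
    DreamRelation may be defined differently; this is a generic stand-in.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of keypoints")
    if visible is None:
        # Hypothetical default: treat every keypoint as visible.
        visible = [True] * len(pred)
    if len(visible) != len(pred):
        raise ValueError("visible must have one entry per keypoint")

    total = 0.0
    count = 0
    for (px, py), (tx, ty), vis in zip(pred, target, visible):
        if not vis:
            continue  # skip keypoints marked as occluded or missing
        total += (px - tx) ** 2 + (py - ty) ** 2
        count += 1
    # Return 0.0 when no keypoint contributes, to avoid dividing by zero.
    return total / count if count else 0.0


if __name__ == "__main__":
    pred = [(0.0, 0.0), (1.0, 1.0)]
    target = [(0.0, 0.0), (4.0, 5.0)]
    print(keypoint_matching_loss(pred, target))
```

A loss of this shape penalizes generated poses whose keypoints drift from the pose implied by the relation (for example, a rider's hands and feet relative to a bicycle), which is the role the abstract assigns to the keypoint matching loss.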