🤖 AI Summary
Existing personalized text-to-image generation methods struggle to control spatial composition over multiple subjects and scale poorly as the subject count grows. This paper proposes a layered-canvas representation coupled with a lightweight locking mechanism, enabling interactive editing of subject positions, scales, and layering order without modifying the architecture of pre-trained diffusion models. By leveraging the model's inherent positional embeddings together with a complementary data sampling strategy, the approach achieves occlusion-free composition and high-fidelity identity preservation, combining contextual adaptability with intuitive layer-based manipulation. Experiments demonstrate significant improvements in compositional controllability and identity consistency for multi-subject generation, outperforming state-of-the-art methods and confirming the method's effectiveness for personalized image creation in complex scenes.
📝 Abstract
Despite their impressive visual fidelity, existing personalized generative models lack interactive control over spatial composition and scale poorly to multiple subjects. To address these limitations, we present LayerComposer, an interactive framework for personalized, multi-subject text-to-image generation. Our approach introduces two main contributions: (1) a layered canvas, a novel representation in which each subject is placed on a distinct layer, enabling occlusion-free composition; and (2) a locking mechanism that preserves selected layers with high fidelity while allowing the remaining layers to adapt flexibly to the surrounding context. As in professional image-editing software, the proposed layered canvas allows users to place, resize, or lock input subjects through intuitive layer manipulation. Our versatile locking mechanism requires no architectural changes, relying instead on inherent positional embeddings combined with a new complementary data sampling strategy. Extensive experiments demonstrate that LayerComposer achieves superior spatial control and identity preservation compared to state-of-the-art methods in multi-subject personalized image generation.
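To make the layered-canvas idea concrete, here is a minimal data-structure sketch. It is purely illustrative: the class and field names (`Layer`, `LayeredCanvas`, `locked`, etc.) are hypothetical and not the paper's actual API; the paper does not publish an interface, only the concept of per-subject layers with position, scale, layering order, and a lock flag.

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    """One subject on the canvas (hypothetical sketch, not the paper's API)."""
    subject_id: str      # identifier of the personalized subject
    x: float             # top-left position in normalized canvas coordinates
    y: float
    scale: float = 1.0   # relative subject size on the canvas
    z: int = 0           # layering order; higher z is drawn on top
    locked: bool = False # locked layers are preserved with high fidelity

@dataclass
class LayeredCanvas:
    """Collection of subject layers supporting place/resize/lock edits."""
    layers: list = field(default_factory=list)

    def add(self, layer: Layer) -> None:
        self.layers.append(layer)

    def composite_order(self) -> list:
        # Back-to-front ordering resolves occlusion between subjects:
        # later (higher-z) layers overdraw earlier ones.
        return sorted(self.layers, key=lambda l: l.z)

    def locked_subjects(self) -> list:
        # Subjects the generator should reproduce with high fidelity,
        # while unlocked layers remain free to adapt to context.
        return [l.subject_id for l in self.layers if l.locked]
```

A user interaction then reduces to ordinary layer edits, e.g. `canvas.add(Layer("subject_a", 0.1, 0.2, z=1, locked=True))` followed by adjusting `scale` or `z` before generation.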