🤖 AI Summary
This work addresses a challenge in multi-condition indoor panoramic image generation: stylistic preferences often conflict with architectural constraints, leading to geometric distortions in the spatial layout. To resolve this, the authors propose a controllable generation framework in which a Prompt-LLM module translates layout and style conditions into semantic prompts, achieving cross-modal alignment between the two. By integrating structure-aware geometric priors with a multi-condition disentanglement mechanism, the framework establishes a conflict-free control architecture that isolates stylistic influences from the spatial layout during generation. A multi-stage training strategy combining supervised fine-tuning and reinforcement learning enables the model to preserve high aesthetic quality while improving structural consistency, offering a reliable solution for professional-grade panoramic indoor visualization.
📝 Abstract
In modern interior design, generating personalized spaces frequently requires a delicate balance between rigid architectural constraints and specific stylistic preferences. However, existing multi-condition generative frameworks often struggle to harmonize these inputs, leading to "condition conflicts" where stylistic attributes inadvertently compromise the geometric precision of the layout. To address this challenge, we present DreamHome-Pano, a controllable panoramic generation framework designed for high-fidelity interior synthesis. Our approach introduces a Prompt-LLM that serves as a semantic bridge, translating layout constraints and style references into professional descriptive prompts to achieve precise cross-modal alignment. To safeguard architectural integrity during generation, we develop a Conflict-Free Control architecture that incorporates structure-aware geometric priors and a multi-condition decoupling strategy, preventing stylistic interference from eroding the spatial layout. Furthermore, we establish a comprehensive panoramic interior benchmark alongside a multi-stage training pipeline encompassing progressive Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). Experimental results demonstrate that DreamHome-Pano achieves a superior balance between aesthetic quality and structural consistency, offering a robust and professional-grade solution for panoramic interior visualization.