🤖 AI Summary
This work addresses end-to-end generation of physically feasible floor plans from natural language descriptions. We propose a two-stage framework: (1) a large language model (LLM) prompted with chain-of-thought (CoT) reasoning parses the user's requirements and generates a semantically consistent initial layout; (2) a conditional diffusion model refines this layout by jointly modeling text-layout cross-modal alignment and physical constraints, yielding geometrically accurate and structurally compliant final floor plans. Our key contribution is the integration of CoT prompting with diffusion-based generation, to our knowledge for the first time, so that semantic understanding and spatial reasoning reinforce each other. Across the evaluated metrics, our method achieves state-of-the-art performance and outperforms existing approaches, most notably in geometric accuracy and requirement fidelity. The source code will be made publicly available.
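To make the first stage more concrete, the following is a minimal sketch of how a CoT prompt might be assembled and how an LLM reply might be turned into room rectangles and checked for overlaps. Everything here is an assumption for illustration: the prompt wording, the JSON schema, the `Room` type, the helper names, and the hard-coded `MOCK_REPLY` that stands in for a real LLM call. None of it is taken from the authors' implementation.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Room:
    """Axis-aligned room rectangle (hypothetical schema, units in metres)."""
    name: str
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width
    h: float  # height


def build_cot_prompt(user_text: str) -> str:
    """Assemble a step-by-step prompt; the wording is illustrative only."""
    steps = [
        "1. List every room the user asks for.",
        "2. Estimate a plausible size for each room.",
        "3. Decide which rooms should be adjacent.",
        "4. Output a JSON list of rooms with keys name, x, y, w, h.",
    ]
    return "User request: " + user_text + "\nThink step by step:\n" + "\n".join(steps)


def parse_layout(llm_reply: str) -> List[Room]:
    """Extract the JSON array that follows the model's free-text reasoning."""
    start = llm_reply.index("[")
    end = llm_reply.rindex("]") + 1
    return [Room(**item) for item in json.loads(llm_reply[start:end])]


def overlaps(a: Room, b: Room) -> bool:
    """True if two rooms share interior area (touching edges do not count)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def overlapping_pairs(rooms: List[Room]) -> List[Tuple[str, str]]:
    """Names of every pair of rooms that overlap, as a basic feasibility check."""
    return [(a.name, b.name)
            for i, a in enumerate(rooms)
            for b in rooms[i + 1:]
            if overlaps(a, b)]


# Stand-in for an LLM response; a real system would call a model here.
MOCK_REPLY = '''Reasoning: the living room and bedroom sit side by side,
with the kitchen placed above them.
[{"name": "living room", "x": 0, "y": 0, "w": 5, "h": 4},
 {"name": "bedroom", "x": 5, "y": 0, "w": 3, "h": 4},
 {"name": "kitchen", "x": 4, "y": 4, "w": 3, "h": 3}]'''

if __name__ == "__main__":
    print(build_cot_prompt("A flat with a living room, a bedroom and a kitchen."))
    layout = parse_layout(MOCK_REPLY)
    print(layout)
    print("Overlapping pairs:", overlapping_pairs(layout))
```

A check like `overlapping_pairs` is only one simple notion of physical feasibility; the framework described above delegates constraint handling to the diffusion stage rather than to a rule of this kind.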
📝 Abstract
This paper proposes a two-phase text-to-floorplan generation method: a Large Language Model (LLM) is guided to generate an initial layout (Layout-LLM), which a conditional diffusion model then refines into the final floorplan. We use a Chain-of-Thought approach to prompt the LLM with the user's text specifications, making house layout design more user-friendly and intuitive. Users can describe their needs in natural language, which improves accessibility and yields clearer geometric constraints. The floorplans obtained by refining the Layout-LLM output with conditional diffusion are more accurate and better meet user requirements. Experimental results show that our approach achieves state-of-the-art performance across all metrics, validating its effectiveness in practical home design applications. We plan to release our code for public use.
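The refinement phase relies on a learned conditional diffusion model whose architecture and training are not described here. The toy sketch below only mimics the control flow of conditional sampling: start from a noisy layout, repeatedly predict a cleaner one given the initial layout as the condition, and reduce the injected noise to zero. The function `stand_in_denoiser` is a hand-written placeholder and not a trained network, and the box format, the 0.5 m grid, the blend weights, and the noise schedule are all assumptions made for illustration.

```python
import random
from typing import List, Sequence

Box = List[float]  # hypothetical format: [x, y, w, h] in metres


def stand_in_denoiser(noisy: Sequence[Box], condition: Sequence[Box],
                      grid: float = 0.5) -> List[Box]:
    """Placeholder for a learned model: pull the noisy sample toward the
    conditioning layout, then snap every coordinate to a regular grid."""
    clean = []
    for noisy_box, cond_box in zip(noisy, condition):
        blended = [0.25 * n + 0.75 * c for n, c in zip(noisy_box, cond_box)]
        clean.append([round(v / grid) * grid for v in blended])
    return clean


def refine(condition: Sequence[Box], steps: int = 20, seed: int = 0) -> List[Box]:
    """Annealed refinement loop conditioned on the initial layout."""
    rng = random.Random(seed)
    # Start from a heavily perturbed copy of the initial layout.
    sample = [[v + rng.gauss(0.0, 1.0) for v in box] for box in condition]
    for t in range(steps, 0, -1):
        noise_scale = (t - 1) / steps  # reaches exactly 0 on the final step
        predicted = stand_in_denoiser(sample, condition)
        # Re-inject a shrinking amount of noise, as reverse samplers do.
        sample = [[p + noise_scale * rng.gauss(0.0, 0.1) for p in box]
                  for box in predicted]
    return sample


if __name__ == "__main__":
    # Slightly imprecise coordinates, as an LLM-produced layout might have.
    initial_layout = [[0.1, 0.0, 4.9, 4.1], [5.1, 0.0, 2.9, 4.0]]
    for box in refine(initial_layout):
        print(box)
```

Because the final step adds no noise, the returned coordinates are whatever the placeholder predicts last, so they always lie on the grid. In a real system the prediction would come from a network trained on floorplan data and conditioned on both the text and the initial layout.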