DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis

📅 2025-10-02
🤖 AI Summary
Existing 3D indoor layout generation methods suffer from either poor generalization (traditional approaches trained on fixed datasets) or insufficient physical plausibility (LLM/VLM-driven methods). DisCo-Layout addresses both limitations with a multi-agent framework that disentangles semantic refinement from physical refinement and then coordinates the two. A Semantic Refinement Tool (SRT) corrects abstract spatial relationships between objects, and a Physical Refinement Tool (PRT), based on grid matching, resolves geometric conflicts. The two tools operate independently and iteratively, orchestrated by three agents: a planner that derives placement rules, a designer that produces initial layouts, and an evaluator that provides closed-loop feedback. This divide-and-conquer design lets semantic and geometric constraints be optimized jointly. In the reported experiments, the method achieves state-of-the-art performance, with improvements in layout plausibility, visual realism, and cross-scene generalization, which supports the construction of complex indoor virtual environments.

📝 Abstract
3D indoor layout synthesis is crucial for creating virtual environments. Traditional methods struggle with generalization due to fixed datasets. While recent LLM and VLM-based approaches offer improved semantic richness, they often lack robust and flexible refinement, resulting in suboptimal layouts. We develop DisCo-Layout, a novel framework that disentangles and coordinates physical and semantic refinement. For independent refinement, our Semantic Refinement Tool (SRT) corrects abstract object relationships, while the Physical Refinement Tool (PRT) resolves concrete spatial issues via a grid-matching algorithm. For collaborative refinement, a multi-agent framework intelligently orchestrates these tools, featuring a planner for placement rules, a designer for initial layouts, and an evaluator for assessment. Experiments demonstrate DisCo-Layout's state-of-the-art performance, generating realistic, coherent, and generalizable 3D indoor layouts. Our code will be publicly available.
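The abstract does not describe how the grid-matching algorithm in the PRT works, so the following is only an illustrative sketch of one way a grid can be used to resolve an overlap. It assumes a 2D occupancy grid over the floor plan and a rectangular object footprint; the names `fits` and `nearest_free_placement` are hypothetical and do not come from the paper.

```python
from collections import deque


def fits(grid, row, col, h, w):
    """Return True if an h x w footprint with top-left cell (row, col)
    stays inside the grid and covers only free (False) cells."""
    rows, cols = len(grid), len(grid[0])
    if row < 0 or col < 0 or row + h > rows or col + w > cols:
        return False
    return all(
        not grid[r][c]
        for r in range(row, row + h)
        for c in range(col, col + w)
    )


def nearest_free_placement(grid, start, h, w):
    """Breadth-first search outward from `start` (a (row, col) tuple) for the
    closest top-left cell where the footprint fits. Returns None if none exists.
    This is an assumed stand-in for grid matching, not the paper's algorithm."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        row, col = queue.popleft()
        if fits(grid, row, col, h, w):
            return (row, col)
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (row + d_row, col + d_col)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


# Toy room: a 4 x 4 grid whose top-left 2 x 2 block is already occupied.
room = [[r < 2 and c < 2 for c in range(4)] for r in range(4)]
new_position = nearest_free_placement(room, (0, 0), 2, 2)
```

Breadth-first search is used here because it returns a valid cell with the smallest number of grid steps from the proposed position, which keeps the corrected placement close to the original one.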
Problem

Research questions and friction points this paper is trying to address.

Improving generalization in 3D indoor layout synthesis
Resolving semantic and physical refinement limitations
Enhancing layout realism and coherence through multi-agent coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Disentangles semantic and physical refinement processes
Uses multi-agent framework for collaborative tool coordination
Employs grid-matching algorithm for spatial optimization
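The coordination idea above can be sketched as a small feedback loop. This is a toy under stated assumptions, not the authors' implementation: the layout is a dictionary of 2D positions, the evaluator is a plain function that labels issues as "semantic" or "physical", and every name (`refine`, `evaluate`, `semantic_tool`, `physical_tool`) is hypothetical. The fixed order in which the tools run is also an assumption.

```python
def evaluate(layout):
    """Toy evaluator: flags the chair if it is far from the desk (semantic)
    and flags any object that shares a position with an earlier one (physical)."""
    issues = {}
    (dx, dy), (cx, cy) = layout["desk"], layout["chair"]
    if abs(dx - cx) + abs(dy - cy) > 1:
        issues["semantic"] = ["chair"]
    seen, clashes = set(), []
    for name, pos in layout.items():
        if pos in seen:
            clashes.append(name)
        seen.add(pos)
    if clashes:
        issues["physical"] = clashes
    return issues


def semantic_tool(layout, names):
    """Toy semantic fix: move each flagged object next to the desk."""
    fixed = dict(layout)
    x, y = fixed["desk"]
    for name in names:
        fixed[name] = (x + 1, y)
    return fixed


def physical_tool(layout, names):
    """Toy physical fix: slide each flagged object along x until its cell is free."""
    fixed = dict(layout)
    for name in names:
        taken = {pos for other, pos in fixed.items() if other != name}
        x, y = fixed[name]
        while (x, y) in taken:
            x += 1
        fixed[name] = (x, y)
    return fixed


def refine(layout, evaluate, tools, max_rounds=5):
    """Evaluate, dispatch each issue type to its own tool, and repeat
    until the evaluator reports nothing or the round budget runs out."""
    for _ in range(max_rounds):
        issues = evaluate(layout)
        if not issues:
            break
        for kind in ("semantic", "physical"):
            if kind in issues:
                layout = tools[kind](layout, issues[kind])
    return layout


initial = {"desk": (0, 0), "chair": (5, 5), "lamp": (0, 0)}
tools = {"semantic": semantic_tool, "physical": physical_tool}
final = refine(initial, evaluate, tools)
```

Keeping the two tools as separate callables mirrors the disentangling idea: each one can be changed or tested on its own, while the loop alone decides when each is invoked.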
Jialin Gao
National University of Singapore
Video Understanding, Multi-modal Understanding

Donghao Zhou
The Chinese University of Hong Kong
Machine Learning, Computer Vision

Mingjian Liang
Monash University

Lihao Liu
Amazon
LLM-based Agent, Healthcare AI

Chi-Wing Fu
The Chinese University of Hong Kong

Xiaowei Hu
South China University of Technology

Pheng-Ann Heng
The Chinese University of Hong Kong