DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting

📅 2025-03-15

📈 Citations: 0

✨ Influential: 0

career value

174K/year

🤖 AI Summary

Current text-to-3D approaches struggle to model fine-grained inter-object interactions in compositional prompts involving multiple objects and spatial relationships, leading to inaccurate layouts and entangled object geometries. To address this, we propose a vision-language model (VLM)-driven progressive Gaussian rasterization framework that jointly models spatial relations among objects first, then progressively refines individual geometry and appearance—achieving unified optimization of relational awareness and fine-grained disentanglement. Our method leverages a VLM for semantic decomposition and introduces relation-guided co-training of geometry and appearance. Evaluated on multiple benchmarks, it significantly outperforms state-of-the-art methods: improving multi-object separation by +21.3%, layout accuracy by +18.7%, and editing controllability—enabling flexible compositional generation and precise local editing.

Technology Category

Application Category

📝 Abstract

Text-to-3D generation saw dramatic advances in recent years by leveraging Text-to-Image models. However, most existing techniques struggle with compositional prompts, which describe multiple objects and their spatial relationships. They often fail to capture fine-grained inter-object interactions. We introduce DecompDreamer, a Gaussian splatting-based training routine designed to generate high-quality 3D compositions from such complex prompts. DecompDreamer leverages Vision-Language Models (VLMs) to decompose scenes into structured components and their relationships. We propose a progressive optimization strategy that first prioritizes joint relationship modeling before gradually shifting toward targeted object refinement. Our qualitative and quantitative evaluations against state-of-the-art text-to-3D models demonstrate that DecompDreamer effectively generates intricate 3D compositions with superior object disentanglement, offering enhanced control and flexibility in 3D generation. Project page : https://decompdreamer3d.github.io

Problem

Research questions and friction points this paper is trying to address.

Generates high-quality 3D compositions from complex prompts

Decomposes scenes into structured components and relationships

Improves object disentanglement and control in 3D generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian splatting for 3D composition generation

Vision-Language Models for scene decomposition

Progressive optimization for object refinement

🔎 Similar Papers

No similar papers found.