DreamAnywhere: Object-Centric Panoramic 3D Scene Generation

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text-to-3D scene generation methods suffer from single-view conditioning, limited visual fidelity, weak scene understanding, and narrow applicability, as they are typically restricted to either indoor or outdoor settings. This paper introduces a text-driven 3D scene generation framework tailored to panoramic 360° environments. The method combines background-object disentangled modeling, 2D-mask-guided 3D mesh refinement, hybrid inpainting-based completion, and neural rendering in an end-to-end, customizable pipeline. It enables object-level 3D placement, fine-grained editing, and immersive navigation while preserving omnidirectional geometric and appearance consistency. Quantitative evaluation and user studies demonstrate superior visual fidelity and layout plausibility over state-of-the-art approaches, with improved novel-view synthesis coherence and image quality, making the system suitable for cost-effective film production and rapid scene prototyping.

📝 Abstract
Recent advances in text-to-3D scene generation have demonstrated significant potential to transform content creation across multiple industries. Although the research community has made impressive progress in addressing the challenges of this complex task, existing methods often generate environments that are only front-facing, lack visual fidelity, exhibit limited scene understanding, and are typically fine-tuned for either indoor or outdoor settings. In this work, we address these issues and propose DreamAnywhere, a modular system for the fast generation and prototyping of 3D scenes. Our system synthesizes a 360° panoramic image from text, decomposes it into background and objects, constructs a complete 3D representation through hybrid inpainting, and lifts object masks to detailed 3D objects that are placed in the virtual environment. DreamAnywhere supports immersive navigation and intuitive object-level editing, making it ideal for scene exploration, visual mock-ups, and rapid prototyping, all with minimal manual modeling. These features make our system particularly suitable for low-budget movie production, enabling quick iteration on scene layout and visual tone without the overhead of traditional 3D workflows. Our modular pipeline is highly customizable as it allows components to be replaced independently. Compared to current state-of-the-art text- and image-based 3D scene generation approaches, DreamAnywhere shows significant improvements in coherence in novel view synthesis and achieves competitive image quality, demonstrating its effectiveness across diverse and challenging scenarios. A comprehensive user study demonstrates a clear preference for our method over existing approaches, validating both its technical robustness and practical usefulness.
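The abstract describes a four-stage modular pipeline: text-to-panorama synthesis, background/object decomposition, hybrid-inpainting-based background completion, and lifting object masks to 3D objects placed in the scene. The sketch below illustrates that orchestration only; every stage function, class, and field name is a hypothetical placeholder (the paper's actual components, e.g. the diffusion and inpainting models, are not shown), and the stubs return strings where real models would return images and meshes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Illustrative data structures; names and fields are assumptions,
# not taken from the paper's implementation.

@dataclass
class SceneObject:
    label: str
    mask_id: int           # index of the object's 2D mask in the panorama
    mesh: str = "pending"  # placeholder for the lifted 3D mesh

@dataclass
class Scene3D:
    background: str                   # completed 3D background representation
    objects: List[SceneObject] = field(default_factory=list)

def text_to_panorama(prompt: str) -> str:
    # Stage 1: synthesize a 360° panoramic image from text (stubbed).
    return f"panorama({prompt})"

def decompose(panorama: str) -> Tuple[str, List[SceneObject]]:
    # Stage 2: split the panorama into a background layer and
    # per-object masks (stubbed with two example objects).
    return f"background_of_{panorama}", [SceneObject("chair", 0),
                                         SceneObject("lamp", 1)]

def hybrid_inpaint(background: str) -> str:
    # Stage 3: complete regions occluded by removed objects
    # via hybrid inpainting (stubbed).
    return f"inpainted({background})"

def lift_to_3d(obj: SceneObject) -> SceneObject:
    # Stage 4: lift a 2D object mask to a detailed 3D mesh (stubbed).
    obj.mesh = f"mesh({obj.label})"
    return obj

def generate_scene(prompt: str) -> Scene3D:
    panorama = text_to_panorama(prompt)
    background, objects = decompose(panorama)
    scene = Scene3D(background=hybrid_inpaint(background))
    # Objects are reconstructed and placed individually, which is what
    # enables the object-level editing the system advertises.
    scene.objects = [lift_to_3d(o) for o in objects]
    return scene
```

Because each stage is an independent function, any component can be swapped out without touching the rest, mirroring the paper's claim that the modular pipeline allows components to be replaced independently.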
Problem

Research questions and friction points this paper is trying to address.

Generating 360° panoramic 3D scenes from text
Improving visual fidelity and scene coherence in 3D generation
Enabling intuitive object-level editing and rapid prototyping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates 360° panoramic images from text
Decomposes scenes into background and objects
Supports immersive navigation and object editing