AI Summary
To address geometric distortion and inaccurate pose estimation caused by severe occlusion in open-set scenarios, this paper proposes a decoupled 3D scene generation framework that separates occlusion removal from 3D object generation to mitigate their mutual interference. We design a global-local attention mechanism integrating self-attention and cross-attention to enhance pose robustness, and introduce OpenScene3D, the first large-scale synthetic dataset tailored for open-set 3D scene composition. Furthermore, we propose a multi-scale unified pose estimation model. Our method is jointly trained on RGB images and dedicated de-occlusion supervision signals, achieving significant improvements in geometric completeness and pose accuracy across diverse indoor and outdoor open-set scenes, consistently outperforming state-of-the-art methods. The code and the OpenScene3D dataset are publicly released.
Abstract
In this work, we propose SceneMaker, a decoupled 3D scene generation framework. Due to the lack of sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion in open-set settings. To address these issues, we first decouple the de-occlusion model from 3D object generation and enhance it with image datasets and collected de-occlusion datasets covering much more diverse open-set occlusion patterns. We then propose a unified pose estimation model that integrates global and local mechanisms into both self-attention and cross-attention to improve accuracy. In addition, we construct an open-set 3D scene dataset to further extend the generalization of the pose estimation model. Comprehensive experiments demonstrate the superiority of our decoupled framework on both indoor and open-set scenes. Our code and datasets are released at https://idea-research.github.io/SceneMaker/.
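The pose estimation model described above combines global and local mechanisms in both self-attention and cross-attention. The abstract does not specify the exact design, so the following is only a minimal numpy sketch under our own assumptions: object tokens attend to each other (self-attention) and to scene features (cross-attention), each in a global variant and a windowed local variant, fused by residual addition. The branch structure, the band-shaped locality mask, and the fusion are illustrative, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # Scaled dot-product attention; mask is boolean with
    # True = "query may attend to this key".
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ v

def band_mask(n_q, n_k, width):
    # Hypothetical locality: each query attends only to keys near its
    # (rescaled) position, forming a diagonal band of the given width.
    centers = (np.arange(n_q)[:, None] * n_k) // n_q
    keys = np.arange(n_k)[None, :]
    return np.abs(centers - keys) <= width

def global_local_block(obj_tokens, scene_tokens, width=4):
    """Illustrative global-local attention block (our assumption,
    not SceneMaker's actual architecture)."""
    n_o, n_s = len(obj_tokens), len(scene_tokens)
    gs = attention(obj_tokens, obj_tokens, obj_tokens)               # global self-attn
    ls = attention(obj_tokens, obj_tokens, obj_tokens,
                   band_mask(n_o, n_o, 1))                           # local self-attn
    gc = attention(obj_tokens, scene_tokens, scene_tokens)           # global cross-attn
    lc = attention(obj_tokens, scene_tokens, scene_tokens,
                   band_mask(n_o, n_s, width))                       # local cross-attn
    return obj_tokens + gs + ls + gc + lc                            # residual fusion

# Example: 5 object-pose queries attending over 40 scene feature tokens.
rng = np.random.default_rng(0)
obj = rng.normal(size=(5, 16))
scene = rng.normal(size=(40, 16))
out = global_local_block(obj, scene)   # shape (5, 16)
```

In a real model each branch would have learned projections and multiple heads; the point here is only how global and local variants of the two attention types can be combined on the same set of object tokens.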