SceneLinker: Compositional 3D Scene Generation via Semantic Scene Graph from RGB Sequences

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods struggle to simultaneously model the contextual relationships among objects and the diversity of object shapes, often producing distorted 3D scene layouts. To address this limitation, this work proposes a compositional 3D scene generation framework grounded in semantic scene graphs. The approach first constructs a semantic scene graph from an RGB image sequence, employing a graph neural network enhanced with a cross-check feature attention mechanism to predict the scene structure. A graph variational autoencoder (graph-VAE) then jointly generates object shapes and layouts by integrating shape and layout priors. Evaluated on the 3RScan/3DSSG and SG-FRONT datasets, the method significantly outperforms existing approaches, generating semantically consistent and structurally plausible 3D scenes even in complex indoor environments and under strong scene graph constraints, thereby enabling high-quality personalized mixed reality content creation.
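As a rough illustration of the data structure the pipeline is built on (not the authors' code), a semantic scene graph is a set of labeled object nodes connected by relation edges; a minimal stdlib-Python sketch, with all object labels and relation predicates hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    """An object recovered from the RGB sequence: a semantic label per node."""
    node_id: int
    label: str  # semantic class, e.g. "chair" (illustrative)

@dataclass
class SceneGraph:
    """Semantic scene graph: objects as nodes, relations as triplet edges."""
    nodes: dict = field(default_factory=dict)   # node_id -> ObjectNode
    edges: list = field(default_factory=list)   # (subject_id, predicate, object_id)

    def add_object(self, node_id, label):
        self.nodes[node_id] = ObjectNode(node_id, label)

    def relate(self, subj_id, predicate, obj_id):
        self.edges.append((subj_id, predicate, obj_id))

    def triplets(self):
        """Human-readable (subject, predicate, object) triplets."""
        return [(self.nodes[s].label, p, self.nodes[o].label)
                for s, p, o in self.edges]

# Toy example: a chair standing on the floor, next to a table.
g = SceneGraph()
g.add_object(0, "floor")
g.add_object(1, "chair")
g.add_object(2, "table")
g.relate(1, "standing on", 0)
g.relate(1, "next to", 2)
print(g.triplets())
# → [('chair', 'standing on', 'floor'), ('chair', 'next to', 'table')]
```

A graph network for scene graph prediction would operate on exactly this kind of node/edge structure, classifying the per-node labels and per-edge predicates from image features.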

📝 Abstract
We introduce SceneLinker, a novel framework that generates compositional 3D scenes via semantic scene graph from RGB sequences. To adaptively experience Mixed Reality (MR) content based on each user's space, it is essential to generate a 3D scene that reflects the real-world layout by compactly capturing the semantic cues of the surroundings. Prior works struggled to fully capture the contextual relationship between objects or mainly focused on synthesizing diverse shapes, making it challenging to generate 3D scenes aligned with object arrangements. We address these challenges by designing a graph network with cross-check feature attention for scene graph prediction and constructing a graph-variational autoencoder (graph-VAE), which consists of a joint shape and layout block for 3D scene generation. Experiments on the 3RScan/3DSSG and SG-FRONT datasets demonstrate that our approach outperforms state-of-the-art methods in both quantitative and qualitative evaluations, even in complex indoor environments and under challenging scene graph constraints. Our work enables users to generate consistent 3D spaces from their physical environments via scene graphs, allowing them to create spatial MR content. Project page is https://scenelinker2026.github.io.
Problem

Research questions and friction points this paper is trying to address.

compositional 3D scene generation
semantic scene graph
object arrangement
contextual relationship
3D scene alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic scene graph
graph-VAE
compositional 3D scene generation
cross-check feature attention
RGB-to-3D
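The graph-VAE listed above builds on the standard variational-autoencoder recipe, in which latent codes are drawn via the reparameterization trick so sampling stays differentiable. A minimal stdlib-Python sketch of that sampling step (illustrative only; `mu` and `log_var` are plain lists standing in for a graph-VAE's learned per-node tensors):

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    Standard VAE reparameterization trick: the noise eps is sampled
    outside the gradient path, so mu and log_var remain trainable.
    """
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)   # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)    # unit Gaussian noise
        z.append(m + sigma * eps)
    return z

# Toy latent code for one scene-graph node (2 latent dimensions).
z = reparameterize(mu=[0.0, 1.0], log_var=[0.0, 0.0], rng=random.Random(0))
print(len(z))  # → 2
```

With `log_var` driven toward negative infinity the sample collapses onto `mu`, which is why the KL term in VAE training is needed to keep the latent distribution from degenerating.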