🤖 AI Summary
This work addresses multi-view inconsistency in user-guided, free-form 3D scene expansion, where generative models often produce novel views that are geometrically misaligned with the original reconstruction. To tackle this, the authors propose SceneExpander, the first method to introduce a dual distillation mechanism for this task: anchor distillation preserves the original scene structure, while inserted-view self-distillation adapts latent geometry and appearance to align coherently with newly added views. The approach enables test-time adaptation of parameterized feed-forward 3D reconstruction models. Experiments on ETH scenes and in-the-wild data demonstrate that SceneExpander significantly improves scene-expansion quality and maintains strong multi-view consistency even under substantial view misalignment.
📝 Abstract
World building with 3D scene representations is increasingly important for content creation, simulation, and interactive experiences, yet real workflows are inherently iterative: creators must repeatedly extend an existing scene under user control. Motivated by this gap, we study 3D scene expansion in a user-centric workflow: starting from a real scene captured by multi-view images, we extend its coverage by inserting an additional view synthesized by a generative model. Unlike simple object editing or style transfer within a fixed scene, the inserted view is often 3D-misaligned with the original reconstruction, introducing geometry shifts, hallucinated content, or view-dependent artifacts that break global multi-view consistency. To address this challenge, we propose SceneExpander, which applies test-time adaptation to a parametric feed-forward 3D reconstruction model with two complementary distillation signals: anchor distillation stabilizes the original scene by distilling geometric cues from the captured views, while inserted-view self-distillation preserves observation-supported predictions yet adapts latent geometry and appearance to accommodate the misaligned inserted view. Experiments on ETH scenes and in-the-wild data demonstrate improved expansion behavior and reconstruction quality under misalignment.
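To make the two distillation signals concrete, here is a minimal sketch of how such a combined objective could look. This is an illustrative assumption, not the paper's actual code: the function name, the weighting factor `lam`, and the per-pixel `confidence` mask (intended to downweight hallucinated, observation-unsupported regions of the inserted view) are all hypothetical.

```python
import numpy as np

def dual_distill_loss(student_anchor, teacher_anchor,
                      student_inserted, self_target, confidence, lam=0.5):
    """Hypothetical combined test-time-adaptation objective.

    student_anchor / teacher_anchor: the adapting model's vs. a frozen
        teacher's predictions (e.g. depth or latent geometry) on the
        original captured views; matching them stabilizes the scene.
    student_inserted / self_target: the adapting model's prediction on the
        inserted view vs. a detached copy of its own earlier prediction.
    confidence: per-pixel weights in [0, 1] that preserve
        observation-supported regions while letting uncertain regions adapt.
    """
    # Anchor distillation: keep the original reconstruction stable.
    l_anchor = np.mean((student_anchor - teacher_anchor) ** 2)
    # Inserted-view self-distillation: weighted self-consistency term.
    l_self = np.mean(confidence * (student_inserted - self_target) ** 2)
    return l_anchor + lam * l_self
```

In an actual pipeline this loss would be minimized over the reconstruction model's parameters at test time (e.g. by a few gradient steps), with the teacher and self-targets held fixed.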