🤖 AI Summary
This work addresses privacy-preserving digital twin reconstruction for multi-room indoor scenes, proposing a reconstruction method that relies solely on multi-view depth maps, requiring no RGB input and no additional training. Methodologically, it introduces a novel inference-time optimization framework that jointly enforces multi-view overlap-aware scene alignment and implicit geometric consistency, with probabilistic modeling of cross-view dependencies; the authors theoretically prove that adding overlapping views reduces variance during denoising. Built upon diffusion models, the approach enables unsupervised, end-to-end consistent reconstruction. On multi-room environment reconstruction, it significantly outperforms state-of-the-art methods, improving FID by 23.6% and LPIPS by 19.4%. Crucially, the original depth data remains fully protected, achieving a principled balance between geometric fidelity and image quality.
📄 Abstract
We introduce a novel diffusion-based approach for generating privacy-preserving digital twins of multi-room indoor environments from depth images only. Central to our approach is the Multi-view Overlapped Scene Alignment with Implicit Consistency (MOSAIC) model, which explicitly models cross-view dependencies within the same scene in the probabilistic sense. MOSAIC operates through a novel inference-time optimization that avoids the error accumulation common in sequential methods and the single-room constraint of panorama-based approaches. MOSAIC scales to complex scenes with zero extra training and provably reduces variance during the denoising process as more overlapping views are added, leading to improved generation quality. Experiments show that MOSAIC outperforms state-of-the-art baselines on image fidelity metrics when reconstructing complex multi-room environments. Project page: https://mosaic-cmubig.github.io
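The variance-reduction claim has a simple statistical intuition: if each overlapping view contributes an independent noisy estimate of the same underlying scene content, averaging N such estimates shrinks the estimator's variance roughly as 1/N. The sketch below illustrates only this intuition with a toy Gaussian model; it is not the paper's actual estimator, and all function names (`noisy_estimate`, `fused_estimate`, `empirical_var`) are hypothetical.

```python
# Toy illustration (not MOSAIC itself): averaging independent noisy
# per-view estimates reduces variance roughly as 1 / n_views.
import random
import statistics

random.seed(0)

def noisy_estimate(true_value: float, sigma: float) -> float:
    """One single-view estimate, modeled as the true value plus Gaussian noise."""
    return true_value + random.gauss(0.0, sigma)

def fused_estimate(true_value: float, sigma: float, n_views: int) -> float:
    """Average n independent overlapping-view estimates (variance ~ sigma**2 / n)."""
    return statistics.fmean(
        noisy_estimate(true_value, sigma) for _ in range(n_views)
    )

def empirical_var(n_views: int, trials: int = 2000) -> float:
    """Monte Carlo estimate of the fused estimator's variance."""
    samples = [fused_estimate(0.0, 1.0, n_views) for _ in range(trials)]
    return statistics.pvariance(samples)

v1 = empirical_var(1)   # roughly 1.0
v4 = empirical_var(4)   # roughly 0.25, i.e. about v1 / 4
print(v1, v4)
```

Under this simplified independence assumption, quadrupling the overlapping views quarters the variance, which mirrors the qualitative claim that more overlap yields cleaner denoising.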