FlowSSC: Universal Generative Monocular Semantic Scene Completion via One-Step Latent Diffusion

📅 2026-01-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Monocular semantic scene completion suffers from inherent ambiguities in 3D geometry and semantics due to occlusions. To address this challenge, this work introduces a generative model into the task for the first time, proposing a universal conditional generation framework based on a triplane latent representation. The framework incorporates a novel Shortcut Flow-matching mechanism that enables single-step, high-fidelity generation and seamlessly integrates with existing feedforward approaches for joint optimization. Evaluated on SemanticKITTI, the method achieves state-of-the-art performance while delivering high-quality outputs and real-time inference capabilities, making it well-suited for practical applications such as autonomous driving.
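
As a rough illustration of why a triplane representation is compact, the sketch below collapses a dense voxel feature volume into three axis-aligned planes and broadcasts them back into a volume. The mean pooling, the sum-based reconstruction, and the function names `voxel_to_triplane` and `triplane_to_voxel` are assumptions made for illustration; the summary does not describe how FlowSSC actually encodes or decodes its triplane latent, which is presumably learned.

```python
import numpy as np


def voxel_to_triplane(vol):
    """Collapse a (C, X, Y, Z) feature volume into three axis-aligned planes.

    Mean pooling along one axis is an illustrative stand-in for a learned encoder.
    """
    xy = vol.mean(axis=3)  # (C, X, Y): pooled over Z
    xz = vol.mean(axis=2)  # (C, X, Z): pooled over Y
    yz = vol.mean(axis=1)  # (C, Y, Z): pooled over X
    return xy, xz, yz


def triplane_to_voxel(xy, xz, yz):
    """Broadcast-sum the three planes back into a (C, X, Y, Z) volume.

    Each voxel (x, y, z) gets xy[x, y] + xz[x, z] + yz[y, z].
    """
    return xy[:, :, :, None] + xz[:, :, None, :] + yz[:, None, :, :]


# Hypothetical toy sizes: 4 channels on an 8 x 8 x 2 grid.
vol = np.arange(4 * 8 * 8 * 2, dtype=float).reshape(4, 8, 8, 2)
xy, xz, yz = voxel_to_triplane(vol)
recon = triplane_to_voxel(xy, xz, yz)
plane_values = xy.size + xz.size + yz.size
```

The three planes hold C·(XY + XZ + YZ) values instead of the C·XYZ values of the dense volume, so the saving grows with grid resolution.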

📝 Abstract

Semantic Scene Completion (SSC) from monocular RGB images is a fundamental yet challenging task due to the inherent ambiguity of inferring occluded 3D geometry from a single view. While feed-forward methods have made progress, they often struggle to generate plausible details in occluded regions and to preserve the fundamental spatial relationships between objects. The ability to reason generatively and accurately about the entire 3D space is critical in real-world applications. In this paper, we present FlowSSC, the first generative framework applied directly to monocular semantic scene completion. FlowSSC treats the SSC task as a conditional generation problem and can seamlessly integrate with existing feed-forward SSC methods to significantly boost their performance. To achieve real-time inference without compromising quality, we introduce Shortcut Flow-matching, which operates in a compact triplane latent space. Unlike standard diffusion models that require hundreds of steps, our method utilizes a shortcut mechanism to achieve high-fidelity generation in a single step, enabling practical deployment in autonomous systems. Extensive experiments on SemanticKITTI demonstrate that FlowSSC achieves state-of-the-art performance, significantly outperforming existing baselines.
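
The single-step sampling idea can be sketched as follows. This is a minimal illustration, assuming the mechanism resembles the general shortcut-model formulation in which the network receives the step size `d` as an extra input and the sampler applies `x <- x + d * s(x, t, d)`; the abstract does not spell out FlowSSC's exact parameterization. The names `shortcut_sample` and `toy_velocity`, the time convention (noise at t = 0, data at t = 1), and the use of the conditioning input as the target latent are all hypothetical stand-ins for a trained network.

```python
import numpy as np


def shortcut_sample(velocity_fn, noise, cond, num_steps=1):
    """Move from noise (t = 0) to a latent (t = 1) in num_steps Euler steps.

    The step size d is passed to the model, which is what allows a
    shortcut-style model to remain accurate when d = 1 (a single step).
    """
    x = noise
    d = 1.0 / num_steps
    for i in range(num_steps):
        t = i * d
        x = x + d * velocity_fn(x, t, d, cond)
    return x


def toy_velocity(x, t, d, cond):
    """Toy stand-in for a trained network (assumption, not the paper's model).

    For a straight path x_t = (1 - t) * noise + t * target, the exact velocity
    is (target - x_t) / (1 - t). Here cond plays the role of the target latent.
    """
    return (cond - x) / (1.0 - t)


rng = np.random.default_rng(0)
target = rng.normal(size=(3, 16, 16))  # hypothetical triplane-shaped latent
noise = rng.normal(size=target.shape)

one_step = shortcut_sample(toy_velocity, noise, target, num_steps=1)
four_step = shortcut_sample(toy_velocity, noise, target, num_steps=4)
```

With this toy velocity field the path is perfectly straight, so one step and four steps land on the same latent. A real learned field is curved, which is why a step-size-aware model is needed to keep the single-step result close to the multi-step one.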

Problem

Research questions and friction points this paper is trying to address.

Semantic Scene Completion

Monocular RGB

3D Geometry

Occluded Regions

Spatial Relationships

Innovation

Methods, ideas, or system contributions that make the work stand out.

FlowSSC

Monocular Semantic Scene Completion

Latent Diffusion