🤖 AI Summary
Existing LiDAR-based 4D dynamic scene modeling approaches apply uniform spatial processing and neglect scene-specific uncertainty variations, leading to frequent artifacts in complex or ambiguous regions and degrading geometric fidelity and temporal stability. To address this, we propose an uncertainty-aware 4D dynamic environment modeling framework. First, we leverage a pretrained segmentation model to generate a spatial uncertainty map. Second, we introduce a "hard-to-easy" two-stage diffusion-based reconstruction paradigm that performs uncertainty-guided geometric completion. Third, we design a mixture of spatio-temporal (MoST) fusion module to adaptively aggregate spatio-temporal features, enhancing consistency across frames. Experiments on multiple benchmarks demonstrate significant improvements in geometric detail and inter-frame coherence. Our method yields more realistic and temporally stable 4D LiDAR sequences, advancing perception and simulation capabilities for autonomous driving systems.
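The spatial uncertainty map in the first step can be sketched as the per-point predictive entropy of the pretrained segmentation model's class logits. This is a minimal illustration, not the paper's actual pipeline; the function names and shapes here are assumptions:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_map(seg_logits):
    """Per-point predictive entropy, normalized to [0, 1].

    seg_logits: (N, C) class logits from a pretrained segmentation
    model (an illustrative stand-in for the real segmentation head).
    High values mark semantically ambiguous, "hard" points.
    """
    p = softmax(seg_logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return entropy / np.log(seg_logits.shape[-1])  # max entropy is log C

# A confident point vs. an ambiguous one:
logits = np.array([[10.0, 0.0, 0.0],   # near one-hot -> low uncertainty
                   [1.0, 1.0, 1.0]])   # uniform -> maximal uncertainty
u = uncertainty_map(logits)
```

Thresholding such a map would separate the high-entropy regions handled first from the easier regions completed afterwards.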
📝 Abstract
Modeling dynamic 3D environments from LiDAR sequences is central to building reliable 4D worlds for autonomous driving and embodied AI. Existing generative frameworks, however, often treat all spatial regions uniformly, overlooking the varying uncertainty across real-world scenes. This uniform generation leads to artifacts in complex or ambiguous regions, limiting realism and temporal stability. In this work, we present U4D, an uncertainty-aware framework for 4D LiDAR world modeling. Our approach first estimates spatial uncertainty maps from a pretrained segmentation model to localize semantically challenging regions. It then performs generation in a "hard-to-easy" manner through two sequential stages: (1) uncertainty-region modeling, which reconstructs high-entropy regions with fine geometric fidelity, and (2) uncertainty-conditioned completion, which synthesizes the remaining areas under learned structural priors. To further ensure temporal coherence, U4D incorporates a mixture of spatio-temporal (MoST) block that adaptively fuses spatial and temporal representations during diffusion. Extensive experiments show that U4D produces geometrically faithful and temporally consistent LiDAR sequences, advancing the reliability of 4D world modeling for autonomous perception and simulation.
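The adaptive fusion performed by the MoST block can be sketched as a learned per-channel gate that mixes the spatial and temporal feature branches. This is a simplified sketch under assumed shapes, with randomly initialized gate weights standing in for learned parameters; the abstract does not specify the block's internals:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def most_fuse(f_spatial, f_temporal, w_gate, b_gate):
    """Gated mixture of spatial and temporal features (illustrative).

    f_spatial, f_temporal: (N, D) features from parallel branches.
    w_gate: (2*D, D) and b_gate: (D,) parameterize the gate; in a
    trained model they would be learned, here they are random.
    """
    gate_in = np.concatenate([f_spatial, f_temporal], axis=-1)
    g = sigmoid(gate_in @ w_gate + b_gate)          # per-channel weights in (0, 1)
    return g * f_spatial + (1.0 - g) * f_temporal   # convex mix of the two branches

N, D = 4, 8
f_s = rng.normal(size=(N, D))
f_t = rng.normal(size=(N, D))
w = rng.normal(scale=0.1, size=(2 * D, D))
b = np.zeros(D)
fused = most_fuse(f_s, f_t, w, b)
```

Because the gate output lies in (0, 1), each fused channel stays between its spatial and temporal inputs, which is one simple way a block could trade off per-frame detail against cross-frame consistency.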