🤖 AI Summary
Existing LiDAR generation methods struggle to simultaneously preserve fine-grained geometric details and ensure global topological consistency; moreover, mainstream diffusion models, which rely on point-cloud embeddings in latent space, compromise geometric interpretability and structural fidelity. To address this, we propose a topology-aware graph diffusion framework: (i) we introduce persistent homology (particularly 0-dimensional homology) for the first time into LiDAR generation, enforcing topological regularization to maintain scene connectivity; (ii) we design a topology-preserving VAE-based graph representation learning module that integrates graph convolutional networks with latent-space diffusion modeling, enhancing both geometric consistency and semantic interpretability. Evaluated on KITTI-360, our method achieves state-of-the-art performance: FRID reduced by 22.6%, MMD reduced by 9.2%, and an inference speed of 1.68 samples/second, demonstrating high fidelity, strong topological robustness, and efficient scalability.
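The summary mentions building graphs over LiDAR points and encoding them with graph convolutional layers. As a rough illustration of that ingredient (not the paper's actual architecture), the sketch below constructs a k-nearest-neighbor graph from raw points and applies one standard Kipf-Welling GCN layer; the function names, `k` value, and feature dimensions are all assumptions for the example.

```python
import numpy as np

def knn_graph(points, k=2):
    """Build a symmetric k-nearest-neighbor adjacency matrix from raw points
    (illustrative graph construction; the paper's actual scheme may differ)."""
    n = len(points)
    # Pairwise Euclidean distances via broadcasting: shape (n, n).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adj = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]  # nearest neighbors, skipping self
        adj[i, nbrs] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize so edges are undirected

def gcn_layer(h, adj, w):
    """One standard GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(len(adj))            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)    # ReLU activation
```

In the paper's pipeline, an encoder stacked from layers like this would map the point graph to a latent representation on which the diffusion model operates.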
📝 Abstract
LiDAR scene generation is critical for mitigating real-world LiDAR data collection costs and enhancing the robustness of downstream perception tasks in autonomous driving. However, existing methods commonly struggle to capture geometric realism and global topological consistency. Recent LiDAR Diffusion Models (LiDMs) predominantly embed LiDAR points into the latent space for improved generation efficiency, which limits their ability to interpretably model detailed geometric structures and preserve global topological consistency. To address these challenges, we propose TopoLiDM, a novel framework that integrates graph neural networks (GNNs) with diffusion models under topological regularization for high-fidelity LiDAR generation. Our approach first trains a topology-preserving VAE to extract latent graph representations via graph construction and multiple graph convolutional layers. Then we freeze the VAE and generate novel latent topological graphs through latent diffusion models. We also introduce 0-dimensional persistent homology (PH) constraints, ensuring the generated LiDAR scenes adhere to real-world global topological structures. Extensive experiments on the KITTI-360 dataset demonstrate TopoLiDM's superiority over state-of-the-art methods, achieving a 22.6% lower Fréchet Range Image Distance (FRID) and a 9.2% lower Minimum Matching Distance (MMD). Notably, our model also enables fast generation at an average speed of 1.68 samples/s, showcasing its scalability for real-world applications. We will release the related code at https://github.com/IRMVLab/TopoLiDM.
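The abstract's 0-dimensional persistent homology constraint tracks how connected components of the point set are born and merge as a distance scale grows. For a Vietoris-Rips filtration, 0-dimensional PH can be computed exactly with a union-find pass over edges sorted by length: every point is born at scale 0, and a component dies at the length of the edge that merges it into another. The sketch below shows this standard computation plus a hypothetical regularizer matching death times between scenes; `topology_loss` and its padding scheme are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def zero_dim_persistence(points):
    """Death times of 0-dimensional homology classes (connected components)
    under the Vietoris-Rips filtration, via union-find over sorted edges.
    All n components are born at scale 0; each of the n-1 merges records
    the merging edge's length as a death time."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            edges.append((np.linalg.norm(points[i] - points[j]), i, j))
    edges.sort()  # process edges in order of increasing length

    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)  # two components merge: one dies here
    return np.array(deaths)

def topology_loss(points, target_deaths):
    """Hypothetical PH regularizer: mean squared gap between sorted death
    times of a generated scene and a reference scene (truncated to the
    shorter diagram). The real TopoLiDM loss may be formulated differently."""
    d_gen = np.sort(zero_dim_persistence(points))
    d_ref = np.sort(np.asarray(target_deaths))
    m = min(len(d_gen), len(d_ref))
    return float(np.mean((d_gen[:m] - d_ref[:m]) ** 2))
```

Penalizing a term like this during generation pushes the generated scene's connectivity structure (e.g. how quickly scattered clusters merge) toward that of real scans, which is the intuition behind the 0-dimensional PH constraint.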