🤖 AI Summary
Existing LiDAR scene generation methods lack explicit control over foreground object layout and spatial semantic relationships, which limits their utility in autonomous driving simulation and safety verification. To address this, we propose a semantic-enhanced scene graph diffusion model and introduce the first layout-guided generation framework, which combines a relation-aware contextual conditioning mechanism with foreground-aware control injection to enable fine-grained control over object positions and semantic relations. We construct the first large-scale LiDAR scene graph datasets, Waymo-SG and nuScenes-SG, and design dedicated layout evaluation metrics. Our method integrates spatial relational attention, semantic encoding, and foreground-aware control, achieving state-of-the-art performance in both generation fidelity and downstream perception tasks. It improves structural plausibility and semantic consistency, establishing a new benchmark for controllable 3D scene generation.
📝 Abstract
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics. While recent diffusion-based models achieve high-fidelity LiDAR generation, they lack explicit control over foreground objects and spatial relationships, limiting their usefulness for scenario simulation and safety validation. To address these limitations, we propose the Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework that introduces semantic-enhanced scene graph diffusion with relation-aware contextual conditioning for structured LiDAR layout generation, followed by foreground-aware control injection for complete scene generation. This enables customizable control over object placement while ensuring spatial and semantic consistency. To support structured LiDAR generation, we introduce Waymo-SG and nuScenes-SG, two large-scale LiDAR scene graph datasets, along with new evaluation metrics for layout synthesis. Extensive experiments demonstrate that La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks, establishing a new benchmark for controllable 3D scene generation.
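To make the notion of a LiDAR scene graph concrete, the following is a minimal sketch of how foreground objects and their pairwise spatial relations might be represented as input to a layout generator. It is an illustration only, not the paper's implementation: the `SceneObject` fields, the relation vocabulary (`near`, `in_front_of`, `behind`, `left_of`, `right_of`), the 5 m `near_threshold`, and the ego-frame convention (x forward, y left) are all assumptions and are not taken from Waymo-SG or nuScenes-SG.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """One foreground object (graph node). All fields are hypothetical."""
    name: str      # instance label, e.g. "car_0"
    category: str  # semantic class, e.g. "car"
    x: float       # box centre in metres, ego frame, x pointing forward
    y: float       # box centre in metres, ego frame, y pointing left


def spatial_relation(a, b, near_threshold=5.0):
    """Return a coarse label describing where `a` lies relative to `b`."""
    dx, dy = a.x - b.x, a.y - b.y
    # Objects closer than the (assumed) threshold are simply "near".
    if math.hypot(dx, dy) <= near_threshold:
        return "near"
    # Otherwise pick the dominant axis of the offset.
    if abs(dx) >= abs(dy):
        return "in_front_of" if dx > 0 else "behind"
    return "left_of" if dy > 0 else "right_of"


def build_scene_graph(objects):
    """Return (nodes, edges); edges are (subject, relation, object) triples."""
    edges = [
        (a.name, spatial_relation(a, b), b.name)
        for a in objects
        for b in objects
        if a is not b
    ]
    return list(objects), edges


if __name__ == "__main__":
    scene = [
        SceneObject("car_0", "car", 10.0, 0.0),
        SceneObject("ped_0", "pedestrian", 12.0, 3.0),
        SceneObject("truck_0", "truck", -15.0, 0.0),
    ]
    _, edges = build_scene_graph(scene)
    for triple in edges:
        print(triple)
```

In a layout-guided pipeline of the kind the abstract describes, triples such as these would serve as the conditioning signal from which object boxes are generated; how La La LiDAR encodes nodes and relations is specified in the paper itself, not here.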