🤖 AI Summary
Existing neural rendering methods (e.g., NeRF, 3DGS) face three key bottlenecks in autonomous driving simulation: slow rendering, support for pinhole cameras only, and difficulty unifying multi-sensor modeling (e.g., fisheye cameras and rotating LiDAR), which leads to cross-modal inconsistency. To address these, we propose SimULi, which combines a factorized 3D Gaussian representation with an anchoring strategy, enabling, for the first time, real-time joint rendering of arbitrary camera models and rotating LiDAR. Built upon the 3DGUT framework, our approach integrates an automated spatial tiling strategy, ray-based culling, and the factorized Gaussian parameterization to significantly improve both efficiency and geometric consistency. Evaluated on multiple autonomous driving datasets, SimULi matches or surpasses state-of-the-art performance in both camera-view and LiDAR point-cloud reconstruction, while rendering 10-20x faster than ray tracing approaches. This work establishes a new paradigm for high-fidelity, multi-sensor-coherent simulation in autonomous driving.
📝 Abstract
Rigorous testing of autonomous robots, such as self-driving vehicles, is essential to ensure their safety in real-world deployments. This requires building high-fidelity simulators to test scenarios beyond those that can be safely or exhaustively collected in the real world. Existing neural rendering methods based on NeRF and 3DGS hold promise but suffer from low rendering speeds or can only render pinhole camera models, hindering their suitability for applications that commonly require high-distortion lenses and LiDAR data. Multi-sensor simulation poses additional challenges, as existing methods handle cross-sensor inconsistencies by favoring the quality of one modality at the expense of others. To overcome these limitations, we propose SimULi, the first method capable of rendering arbitrary camera models and LiDAR data in real time. Our method extends 3DGUT, which natively supports complex camera models, with LiDAR support via an automated tiling strategy for arbitrary spinning LiDAR models and ray-based culling. To address cross-sensor inconsistencies, we design a factorized 3D Gaussian representation and anchoring strategy that reduces mean camera and depth error by up to 40% compared to existing methods. SimULi renders 10-20x faster than ray tracing approaches and 1.5-10x faster than prior rasterization-based work, while handling a wider range of camera models. When evaluated on two widely benchmarked autonomous driving datasets, SimULi matches or exceeds the fidelity of existing state-of-the-art methods across numerous camera and LiDAR metrics.
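To make the tiling and culling ideas above concrete, the sketch below is an illustration only, not the paper's implementation: it generates ray directions for an idealized spinning LiDAR (a 360-degree azimuth sweep crossed with a fixed set of beam elevations), partitions rays into equal azimuth sectors as a stand-in for the automated tiling, and applies a standard ray/AABB slab test as a stand-in for ray-based culling. All function names, the beam configuration, and the use of axis-aligned boxes are assumptions made for the sketch.

```python
import numpy as np

def spinning_lidar_rays(n_azimuth=360, elevations_deg=(-15.0, -7.0, 0.0, 7.0, 15.0)):
    """Unit ray directions for an idealized spinning LiDAR: a full azimuth
    sweep crossed with fixed beam elevations. Returns (rays, azimuths)."""
    az = np.deg2rad(np.linspace(0.0, 360.0, n_azimuth, endpoint=False))
    el = np.deg2rad(np.asarray(elevations_deg, dtype=np.float64))
    azg, elg = np.meshgrid(az, el, indexing="ij")
    dirs = np.stack([np.cos(elg) * np.cos(azg),
                     np.cos(elg) * np.sin(azg),
                     np.sin(elg)], axis=-1)
    return dirs.reshape(-1, 3), azg.reshape(-1)

def azimuth_tiles(azimuths, n_tiles=8):
    """Assign each ray to one of n_tiles equal azimuth sectors, a simple
    stand-in for tiling a spinning scan pattern."""
    return (azimuths / (2.0 * np.pi) * n_tiles).astype(int) % n_tiles

def cull_rays_aabb(origins, dirs, box_min, box_max):
    """Slab test: boolean mask of rays that intersect an axis-aligned box,
    a common culling primitive (not necessarily the one used by SimULi)."""
    safe = np.where(np.abs(dirs) < 1e-12, 1e-12, dirs)
    t0 = (box_min - origins) / safe
    t1 = (box_max - origins) / safe
    tmin = np.minimum(t0, t1).max(axis=1)
    tmax = np.maximum(t0, t1).min(axis=1)
    return tmax >= np.maximum(tmin, 0.0)
```

Grouping rays by azimuth sector lets each tile load only the scene content its rays can reach, which is the intuition behind tiling a rotating sensor; the slab test then discards rays that miss a given bounding volume entirely.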