🤖 AI Summary
This work addresses the limitations of existing LiDAR ground-segmentation methods, which typically rely on handcrafted rules or costly point-level annotations and therefore generalize poorly. The paper proposes the first self-supervised, domain-agnostic framework for ground segmentation that requires no manual labeling and applies to arbitrary LiDAR sensors. Trained on the unified large-scale OmniLiDAR dataset, the approach combines a runtime pseudo-labeling module (PseudoLabeler) with a data-normalization strategy spanning 15 distinct LiDAR sensor models, enabling efficient unsupervised learning. The method achieves state-of-the-art performance on the nuScenes, SemanticKITTI, and Waymo benchmarks while meeting real-time processing requirements. The model and code are publicly released.
📄 Abstract
LiDAR perception is fundamental to robotics, enabling machines to understand their environment in 3D. A crucial task for LiDAR-based scene understanding and navigation is ground segmentation. However, existing methods are either handcrafted for specific sensor configurations or rely on costly per-point manual labels, severely limiting their generalization and scalability. To overcome this, we introduce TerraSeg, the first self-supervised, domain-agnostic model for LiDAR ground segmentation. We train TerraSeg on OmniLiDAR, a unified large-scale dataset that aggregates and standardizes data from 12 major public benchmarks. Spanning almost 22 million raw scans across 15 distinct sensor models, OmniLiDAR provides unprecedented diversity for learning a highly generalizable ground model. To supervise training without human annotations, we propose PseudoLabeler, a novel module that generates high-quality ground and non-ground labels through self-supervised per-scan runtime optimization. Extensive evaluations demonstrate that, despite using no manual labels, TerraSeg achieves state-of-the-art results on nuScenes, SemanticKITTI, and Waymo Perception while delivering real-time performance. Our code and model weights are publicly available.