rareboost3d: a synthetic lidar dataset with enhanced rare classes

📅 2025-10-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the long-tailed distribution problem in real-world LiDAR point cloud data—caused by severe scarcity of rare-class samples—this paper proposes RareBoost3D, the first synthetic 3D point cloud dataset explicitly designed for long-tail mitigation. We further introduce a Cross-domain Semantic Consistency (CSC) loss to enforce fine-grained feature alignment between synthetic and real domains. Crucially, our method operates without requiring ground-truth annotations for rare classes: it controllably generates high-fidelity synthetic point clouds densely populated with rare objects (e.g., traffic cones, bicycles), while the CSC loss regularizes semantic space distributions across domains. This significantly enhances model generalization to tail classes. On benchmarks including SemanticKITTI, our approach achieves an average 12.7% improvement in mIoU for rare classes, demonstrating both effectiveness and practicality.

Technology Category

Application Category

📝 Abstract
Real-world point cloud datasets have made significant contributions to the development of LiDAR-based perception technologies, such as object segmentation for autonomous driving. However, due to the limited number of instances in some rare classes, the long-tail problem remains a major challenge in existing datasets. To address this issue, we introduce a novel, synthetic point cloud dataset named RareBoost3D, which complements existing real-world datasets by providing significantly more instances for object classes that are rare in real-world datasets. To effectively leverage both synthetic and real-world data, we further propose a cross-domain semantic alignment method named CSC loss that aligns feature representations of the same class across different domains. Experimental results demonstrate that this alignment significantly enhances the performance of LiDAR point cloud segmentation models over real-world data.
Problem

Research questions and friction points this paper is trying to address.

Addresses long-tail class imbalance in LiDAR datasets
Enhances rare object instances using synthetic data
Improves cross-domain segmentation via feature alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic dataset enhances rare object classes
Cross-domain alignment method aligns feature representations
Improves LiDAR point cloud segmentation performance
🔎 Similar Papers
No similar papers found.