📝 Abstract
End-to-end models capable of handling multiple subtasks in parallel have become a new trend, presenting significant challenges and opportunities for integrating multiple tasks within the domain of 3-D vision. The limitations of 3-D data acquisition have not only restricted the exploration of many innovative research problems but have also caused existing 3-D datasets to focus predominantly on single tasks. As a result, 3-D multitask learning lacks systematic approaches and theoretical frameworks, and most existing efforts merely serve as auxiliary support for a primary task. In this article, we introduce WHU-Synthetic, a large-scale 3-D synthetic perception dataset designed for multitask learning, spanning initial data augmentation (upsampling and depth completion), scene understanding (segmentation), and macrolevel tasks (place recognition and 3-D reconstruction). Because all data are collected in the same environmental domain, the subtasks are inherently aligned, enabling the construction of multitask models without separate training procedures. In addition, we implement several novel settings that realize ideas difficult to achieve in real-world scenarios, supporting more adaptive and robust multitask perception, such as sampling on city-level models, providing point clouds with different densities, and simulating temporal changes. Using our dataset, we conduct several experiments to investigate mutual benefits between subtasks, revealing new observations, challenges, and opportunities for future research. The dataset is accessible at: https://github.com/WHU-USI3DV/WHU-Synthetic.