🤖 AI Summary
Traditional DSM-to-DTM filtering depends on manually tuned parameters, and learning-based methods need carefully designed architectures that are often paired with post-processing. This work introduces GrounDiff, presented as the first diffusion-based framework for the task, which formulates ground surface extraction as a denoising process. A gated design with confidence-guided generation enables selective filtering, and Prior-Guided Stitching (PrioStitch) uses a downsampled global prior, generated automatically by GrounDiff, to guide local high-resolution predictions and improve scalability. Compared with deep learning-based state-of-the-art methods, GrounDiff reduces RMSE by up to 93% on ALS2DTM and up to 47% on USGS benchmarks. On the GeRoD road reconstruction benchmark, it achieves up to 81% lower distance error than specialized techniques while maintaining competitive surface smoothness using only DSM inputs. The GrounDiff+ variant is designed to produce even smoother road surfaces.
📝 Abstract
Digital Terrain Models (DTMs) represent the bare-earth elevation and are important in numerous geospatial applications. Such data models cannot be directly measured by sensors and are typically generated from Digital Surface Models (DSMs) derived from LiDAR or photogrammetry. Traditional filtering approaches rely on manually tuned parameters, while learning-based methods require well-designed architectures, often combined with post-processing. To address these challenges, we introduce Ground Diffusion (GrounDiff), the first diffusion-based framework that iteratively removes non-ground structures by formulating the problem as a denoising task. We incorporate a gated design with confidence-guided generation that enables selective filtering. To increase scalability, we further propose Prior-Guided Stitching (PrioStitch), which employs a downsampled global prior automatically generated using GrounDiff to guide local high-resolution predictions. We evaluate our method on the DSM-to-DTM translation task across diverse datasets, showing that GrounDiff consistently outperforms deep learning-based state-of-the-art methods, reducing RMSE by up to 93% on ALS2DTM and up to 47% on USGS benchmarks. In the task of road reconstruction, which requires both high precision and smoothness, our method achieves up to 81% lower distance error compared to specialized techniques on the GeRoD benchmark, while maintaining competitive surface smoothness using only DSM inputs, without task-specific optimization. Our variant for road reconstruction, GrounDiff+, is specifically designed to produce even smoother surfaces, further surpassing state-of-the-art methods. The project page is available at https://deepscenario.github.io/GrounDiff/.
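To make the prior-guided stitching idea more concrete, here is a minimal sketch, not the authors' implementation. It assumes a coarse global pass on a block-averaged DSM, nearest-neighbour upsampling of that result as the prior, overlapping tiles, and plain averaging in the overlaps. The names `toy_model`, `stitch_with_prior`, `tile_starts`, the tile sizes, and the example RMSE values are all hypothetical; `toy_model` is a trivial stand-in for a learned denoiser and exists only so the sketch runs.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error between two elevation rasters."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def relative_reduction(baseline, ours):
    """Percentage by which `ours` lowers an error relative to `baseline`."""
    return 100.0 * (baseline - ours) / baseline

def downsample(x, factor):
    """Block-mean downsampling; assumes both sides are divisible by factor."""
    h, w = x.shape
    assert h % factor == 0 and w % factor == 0
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsampling back to the original grid."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def tile_starts(n, tile, stride):
    """Start offsets that cover [0, n) with tiles of size `tile` (n >= tile)."""
    starts = list(range(0, n - tile + 1, stride))
    if starts[-1] + tile < n:
        starts.append(n - tile)  # extra tile so the border is covered
    return starts

def toy_model(dsm_tile, prior_tile=None):
    """Hypothetical stand-in for a learned model (assumption, not GrounDiff).

    Without a prior: guess a flat ground at the tile minimum.
    With a prior: keep the DSM only where it does not rise above the prior.
    """
    if prior_tile is None:
        return np.full_like(dsm_tile, dsm_tile.min())
    return np.minimum(dsm_tile, prior_tile)

def stitch_with_prior(dsm, model, tile=64, stride=48, factor=4):
    """Coarse global pass, then prior-guided tiles averaged in the overlaps."""
    h, w = dsm.shape
    prior = upsample(model(downsample(dsm, factor), None), factor)
    acc = np.zeros((h, w), dtype=float)
    weight = np.zeros((h, w), dtype=float)
    for top in tile_starts(h, tile, stride):
        for left in tile_starts(w, tile, stride):
            sl = (slice(top, top + tile), slice(left, left + tile))
            acc[sl] += model(dsm[sl], prior[sl])
            weight[sl] += 1.0
    return acc / weight

# Synthetic scene: flat ground at 10 m with one 20 m tall building.
ground = np.full((128, 128), 10.0)
dsm = ground.copy()
dsm[30:60, 40:90] += 20.0
dtm = stitch_with_prior(dsm, toy_model)
```

On this synthetic scene the stand-in removes the building entirely, which says nothing about real terrain. The point is the data flow: every tile sees the same global prior, so neighbouring tiles agree where they overlap. The `relative_reduction` helper shows how an "up to 93%" figure is read: with illustrative values that are not taken from the paper, a baseline RMSE of 2.0 m and a new RMSE of 0.14 m correspond to a 93% reduction.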