Semantic-aware DropSplat: Adaptive Pruning of Redundant Gaussians for 3D Aerial-View Segmentation

2025-08-13
AI Summary
To address semantic ambiguity in 3D aerial scenes caused by scale variation and structural occlusion, this paper proposes a semantics-aware adaptive pruning method for Gaussian points. The method uses Gaussian point clouds as the geometric representation and makes two main contributions: (1) a learnable sparsification mechanism based on the Hard Concrete distribution, jointly optimized with a semantic confidence estimation module to dynamically remove redundant Gaussian points; and (2) a weakly supervised optimization scheme that uses high-confidence pseudo-labels generated by a pre-trained 2D foundation model. By preserving semantic consistency while making the representation more compact, the approach targets better segmentation accuracy and inference efficiency under limited annotations. Experiments on the newly constructed 3D-AS dataset are reported to show improvements in both segmentation accuracy and computational efficiency, supporting compact semantic understanding of complex 3D aerial scenes.

Abstract
In the task of 3D Aerial-view Scene Semantic Segmentation (3D-AVS-SS), traditional methods struggle to address semantic ambiguity caused by scale variations and structural occlusions in aerial images. This limits their segmentation accuracy and consistency. To tackle these challenges, we propose a novel 3D-AVS-SS approach named SAD-Splat. Our method introduces a Gaussian point drop module, which integrates semantic confidence estimation with a learnable sparsity mechanism based on the Hard Concrete distribution. This module effectively eliminates redundant and semantically ambiguous Gaussian points, enhancing both segmentation performance and representation compactness. Furthermore, SAD-Splat incorporates a high-confidence pseudo-label generation pipeline. It leverages 2D foundation models to enhance supervision when ground-truth labels are limited, thereby further improving segmentation accuracy. To advance research in this domain, we introduce a challenging benchmark dataset: 3D Aerial Semantic (3D-AS), which encompasses diverse real-world aerial scenes with sparse annotations. Experimental results demonstrate that SAD-Splat achieves an excellent balance between segmentation accuracy and representation compactness. It offers an efficient and scalable solution for 3D aerial scene understanding.
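
The abstract names the Hard Concrete distribution as the basis of the learnable sparsity mechanism but gives no implementation details. The following is a minimal sketch of a generic Hard Concrete gate, one per Gaussian point, and not the authors' module. The constants `GAMMA`, `ZETA`, and `BETA` are common defaults from the Hard Concrete literature rather than values from the paper, and all function names are hypothetical. How SAD-Splat couples the gate with semantic confidence is not described here, so that part is left out.

```python
import math
import random

# Assumed constants (common Hard Concrete defaults, not taken from the paper).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_gate(log_alpha, rng):
    """Draw one stochastic gate value in [0, 1] for a point with logit log_alpha."""
    # Clamp the uniform sample away from 0 and 1 so the logs stay finite.
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    # Stretch to (GAMMA, ZETA), then clip, so exact 0 and 1 have nonzero probability.
    return min(1.0, max(0.0, s * (ZETA - GAMMA) + GAMMA))


def expected_open_prob(log_alpha):
    """Probability that the gate is nonzero; summed over points it acts as an L0-style penalty."""
    return sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA))


def deterministic_gate(log_alpha):
    """Noise-free gate used at inference time."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))


def prune(log_alphas):
    """Return the indices of points whose inference-time gate is still open."""
    return [i for i, la in enumerate(log_alphas) if deterministic_gate(la) > 0.0]
```

In a training loop, each point's opacity or feature contribution would typically be multiplied by `sample_gate(...)`, with the sum of `expected_open_prob(...)` added to the loss to encourage gates to close. After training, `prune` drops the points whose gate has collapsed to zero.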
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic ambiguity in 3D aerial segmentation
Eliminates redundant Gaussian points for compact representation
Improves segmentation accuracy with limited ground-truth labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-aware Gaussian pruning with Hard Concrete
High-confidence pseudo-labels from 2D foundation models
Novel 3D aerial benchmark dataset with sparse annotations
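
High-confidence pseudo-labelling, as listed above, can be illustrated with a simple confidence threshold over per-pixel class probabilities. This is a hedged sketch, not the paper's pipeline. The threshold value, the `IGNORE` marker, and the rule that sparse ground truth overrides pseudo-labels are all assumptions made for illustration, and the function names are hypothetical.

```python
IGNORE = -1  # hypothetical marker for "no supervision at this pixel"


def pseudo_labels(probs, threshold=0.9):
    """Keep the argmax class only where the 2D model is confident enough.

    probs: list of per-pixel class-probability lists (assumed model output).
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)
        labels.append(best if p[best] >= threshold else IGNORE)
    return labels


def merge_with_ground_truth(gt, pseudo):
    """Assumed merge rule: sparse ground truth wins, pseudo-labels fill the gaps."""
    return [g if g != IGNORE else p for g, p in zip(gt, pseudo)]
```

Pixels left as `IGNORE` would simply be excluded from the segmentation loss, so low-confidence predictions from the 2D model do not add label noise.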
Xu Tang
School of Artificial Intelligence, Xidian University, Xi’an, China
Junan Jia
School of Artificial Intelligence, Xidian University, Xi’an, China
Yijing Wang
School of Artificial Intelligence, Xidian University, Xi’an, China
Jingjing Ma
School of Artificial Intelligence, Xidian University, Xi’an, China
Xiangrong Zhang
Professor, Xidian University
Image processing and understanding, pattern recognition, machine learning