🤖 AI Summary
Existing point cloud sampling methods lack task awareness, often introducing redundant points, resulting in low efficiency and requiring post-processing. This paper proposes a contribution-driven learnable sampling network that formulates sampling as a differentiable Top-k selection, jointly optimizing for task relevance and point uniqueness. Key contributions include: (1) the first differentiable Top-k approximation based on entropy-regularized optimal transport; and (2) a novel architecture integrating spatial pooling embedding, cascaded offset attention, and contribution scoring modules for end-to-end semantic-aware optimization. Evaluated on ModelNet40 and PU147, our method achieves state-of-the-art performance across classification, registration, compression, and surface reconstruction tasks—demonstrating significant improvements in both accuracy and computational efficiency.
📝 Abstract
Point cloud sampling plays a crucial role in reducing computation costs and storage requirements for various vision tasks. Traditional sampling methods, such as farthest point sampling, lack task-specific information and, as a result, cannot guarantee optimal performance in specific applications. Learning-based methods train a network to sample the point cloud for the targeted downstream task. However, they do not guarantee that the sampled points are the most relevant ones. Moreover, they may produce duplicate sampled points, which requires completing the sampled point cloud through post-processing. To address these limitations, we propose a contribution-based sampling network (CS-Net), in which the sampling operation is formulated as a Top-k operation. To allow the network to be trained end to end with gradient descent, we use a differentiable approximation of the Top-k operation obtained via entropy regularization of an optimal transport problem. Our network consists of a feature embedding module, a cascade attention module, and a contribution scoring module. The feature embedding module includes a specifically designed spatial pooling layer that reduces parameters while preserving important features. The cascade attention module combines the outputs of three skip-connected offset attention layers to emphasize salient features and suppress less important ones. The contribution scoring module generates a contribution score for each point and guides the sampling process to prioritize the most important points. Experiments on the ModelNet40 and PU147 datasets show that CS-Net achieves state-of-the-art performance in two semantic-based downstream tasks (classification and registration) and two reconstruction-based tasks (compression and surface reconstruction).
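To make the core idea concrete, the differentiable Top-k described above can be sketched as an entropy-regularized optimal transport problem solved with Sinkhorn iterations: each point (mass 1/n) is softly transported to a "selected" bin (mass k/n) or a "rejected" bin (mass (n-k)/n), with a cost that favors sending high-scoring points to the selected bin. This is a minimal NumPy sketch of the general technique, not the paper's implementation; the function name, cost design, and hyperparameters (`eps`, `n_iter`) are illustrative assumptions.

```python
import numpy as np

def soft_top_k(scores, k, eps=0.1, n_iter=200):
    """Soft top-k selection via entropy-regularized optimal transport.

    Transports n points (mass 1/n each) to two bins: "selected" (mass k/n)
    and "rejected" (mass (n-k)/n). The cost encourages high-scoring points
    to land in the selected bin. Returns soft selection weights in [0, 1]
    that sum to k; as eps -> 0 they approach a hard top-k indicator.
    Illustrative sketch only, not the architecture from the paper.
    """
    n = len(scores)
    # Normalize scores to [0, 1] so the squared-distance cost is well scaled.
    s = (scores - scores.min()) / (scores.max() - scores.min() + 1e-9)
    # Cost of assigning point i to bin j, with bin targets y = [1, 0].
    C = np.stack([(s - 1.0) ** 2, s ** 2], axis=1)
    K = np.exp(-C / eps)                  # Gibbs kernel
    a = np.full(n, 1.0 / n)               # source marginal (points)
    b = np.array([k / n, (n - k) / n])    # target marginal (two bins)
    u, v = np.ones(n), np.ones(2)
    for _ in range(n_iter):               # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]       # entropic transport plan
    return n * P[:, 0]                    # soft indicator of selection

scores = np.array([0.9, 0.1, 0.8, 0.2, 0.5])
weights = soft_top_k(scores, k=2, eps=0.01)  # near 1 for the two top scores
```

Because every step is a smooth function of the scores, gradients can flow back through the selection to the score-producing network, which is what allows an end-to-end trained sampler in place of a hard, non-differentiable Top-k.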