🤖 AI Summary
Existing 3D point cloud convolution methods face a trade-off between geometric fidelity, which point-based approaches preserve, and computational efficiency, which voxel-based approaches obtain through quantization. This trade-off limits performance in tasks such as registration. To address this, the authors propose PointCNN++, an architecture that generalizes sparse convolution from voxels to points and operates natively on the original point coordinates. PointCNN++ introduces a point-centric receptive field and formulates the convolution as a Matrix-Vector Multiplication and Reduction (MVMR) problem, implemented via a dedicated GPU kernel. By avoiding voxel quantization, it preserves the geometric precision of the input. Experiments show that PointCNN++ uses an order of magnitude less memory and runs several times faster than representative point-based baselines. When used as a replacement for voxel-based backbones, it significantly improves registration accuracy while being both more memory-efficient and faster.
📝 Abstract
Existing convolutional learning methods for 3D point cloud data are divided into two paradigms: point-based methods that preserve geometric precision but often face performance challenges, and voxel-based methods that achieve high efficiency through quantization at the cost of geometric fidelity. This loss of precision is a critical bottleneck for tasks such as point cloud registration. We propose PointCNN++, a novel architectural design that fundamentally mitigates this precision-performance trade-off. It generalizes sparse convolution from voxels to points, treating voxel-based convolution as a specialized, degraded case of our more general point-based convolution. First, we introduce a point-centric convolution where the receptive field is centered on the original, high-precision point coordinates. Second, to make this high-fidelity operation performant, we design a computational strategy that operates natively on points. We formulate the convolution on native points as a Matrix-Vector Multiplication and Reduction (MVMR) problem, for which we develop a dedicated, highly optimized GPU kernel. Experiments demonstrate that PointCNN++ uses an order of magnitude less memory and is several times faster than representative point-based methods. Furthermore, when used as a simple replacement for the voxel-based backbones it generalizes, it significantly improves point cloud registration accuracy while proving both more memory-efficient and faster. PointCNN++ shows that preserving geometric detail and achieving high performance are not mutually exclusive, paving the way for a new class of 3D learning with high fidelity and efficiency. Our code will be open sourced.
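The abstract does not spell out the MVMR formulation, so the following is only an illustrative sketch of one way a point-centric convolution might reduce to matrix-vector products followed by a reduction. It is not the authors' GPU kernel. The 3×3×3 kernel, the `cell` size parameter, and the names `point_conv`, `matvec`, and `KERNEL_INDEX` are all assumptions made for illustration. The idea shown is that each neighbor's offset is measured from the original center point rather than from a voxel center, the offset selects a weight matrix, and the resulting matrix-vector products are summed per center.

```python
from itertools import product

# Assumed 3x3x3 kernel: map each integer offset to a weight-matrix index.
KERNEL_INDEX = {off: i for i, off in enumerate(product((-1, 0, 1), repeat=3))}


def matvec(matrix, vector):
    """Multiply a feature vector (length C_in) by a C_in x C_out matrix."""
    n_out = len(matrix[0])
    return [sum(v * row[c] for v, row in zip(vector, matrix)) for c in range(n_out)]


def point_conv(coords, feats, weights, cell):
    """Hypothetical point-centric convolution (reference loop, not optimized).

    coords  : list of (x, y, z) floats, the original unquantized points
    feats   : list of feature vectors, one per point (length C_in)
    weights : list of 27 matrices, each C_in x C_out
    cell    : assumed kernel cell size, in the same units as coords
    """
    n_out = len(weights[0][0])
    out = []
    for center in coords:
        acc = [0.0] * n_out
        for point, feat in zip(coords, feats):
            # Offset relative to the exact center point, not a voxel center.
            off = tuple(round((p - c) / cell) for p, c in zip(point, center))
            k = KERNEL_INDEX.get(off)
            if k is None:
                continue  # neighbor falls outside the receptive field
            # Matrix-vector product, then reduction into the accumulator.
            contrib = matvec(weights[k], feat)
            acc = [a + b for a, b in zip(acc, contrib)]
        out.append(acc)
    return out
```

In this sketch, snapping every coordinate to a grid before calling `point_conv` would make all offsets depend only on voxel indices, which is one way to read the abstract's remark that voxel-based convolution is a special, degraded case of the point-based one. A practical implementation would replace the quadratic neighbor loop with a spatial index and batch the products on the GPU.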