🤖 AI Summary
Existing sparse convolution (SpC) engines fail to exploit the integer-valued, spatially bounded, and geometrically continuous properties of voxel coordinates, resulting in high preprocessing and postprocessing overhead for kernel map construction. This paper introduces Spira, the first GPU-accelerated sparse convolution engine specifically designed around these properties of voxelized point clouds. Spira incorporates four key innovations: (1) a one-shot search algorithm for kernel neighbor enumeration, (2) packed native coordinate access to eliminate redundant memory operations, (3) adaptive dual-dataflow execution to balance compute and memory throughput, and (4) fully concurrent kernel map construction across the entire network. Experiments demonstrate that Spira achieves 1.71× average end-to-end inference speedup (up to 2.31×) and 2.13× average per-layer sparse convolution acceleration (up to 3.32×) over state-of-the-art baselines. It delivers efficient and scalable sparse computation for representative autonomous driving and AR/VR models.
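To make the "packed native coordinate access" idea concrete, here is a minimal sketch of how bounded integer voxel coordinates can be packed into a single machine word so that a 3D coordinate is read or compared with one memory access. The bit widths and function names are assumptions for illustration, not Spira's actual layout.

```python
# Hypothetical packing of a bounded (x, y, z) voxel coordinate into one
# 64-bit integer key. Because voxel coordinates are integer-valued and
# bounded, each component fits in a fixed number of bits (21 assumed here).
X_BITS = Y_BITS = Z_BITS = 21
MASK = (1 << 21) - 1

def pack(x, y, z):
    """Pack three non-negative 21-bit coordinates into a single key."""
    return (x << (Y_BITS + Z_BITS)) | (y << Z_BITS) | z

def unpack(key):
    """Recover (x, y, z) from a packed key."""
    return (key >> (Y_BITS + Z_BITS)) & MASK, (key >> Z_BITS) & MASK, key & MASK
```

A packed key also makes neighbor arithmetic cheap: adding a small offset to one component is a shift-and-add on the key rather than three separate loads and stores.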
📝 Abstract
Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and AR/VR. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, and weight offsets, then uses this map to compute feature vectors for output coordinates. Our work identifies three key properties of voxel coordinates: they are integer-valued, bounded within a limited spatial range, and geometrically continuous: neighboring voxels on the same object surface are highly likely to exist at small spatial offsets from each other. Prior SpC engines do not fully exploit these properties and suffer from high preprocessing and postprocessing overheads during kernel map construction. To address this, we design Spira, the first voxel-property-aware SpC engine for GPUs. Spira proposes: (i) a high-performance one-shot search algorithm that builds the kernel map with no preprocessing and high memory locality, (ii) an effective packed-native processing scheme that accesses packed voxel coordinates at low cost, (iii) a flexible dual-dataflow execution mechanism that efficiently computes output feature vectors by adapting to layer characteristics, and (iv) a network-wide parallelization strategy that builds kernel maps for all SpC layers concurrently at network start. Our evaluation shows that Spira significantly outperforms prior SpC engines by 1.71x on average and up to 2.31x for end-to-end inference, and by 2.13x on average and up to 3.32x for layer-wise execution across diverse layer configurations.
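The kernel map described above can be sketched as follows. This is a generic hash-based construction for a 3x3x3 submanifold-style sparse convolution (output coordinates equal input coordinates), shown only to illustrate what the map stores; it is not Spira's one-shot search algorithm, and all names are hypothetical.

```python
# Illustrative kernel map construction: for each output voxel, probe all 27
# kernel offsets and record (input_idx, output_idx, weight_offset) triples
# for the offsets whose neighbor voxel is occupied.
from itertools import product

def build_kernel_map(coords):
    """coords: list of (x, y, z) integer voxel coordinates.
    Returns a list of (input_idx, output_idx, weight_offset) triples."""
    index = {c: i for i, c in enumerate(coords)}        # coordinate -> row index
    offsets = list(product((-1, 0, 1), repeat=3))       # 27 kernel weight offsets
    kernel_map = []
    for out_idx, (x, y, z) in enumerate(coords):
        for w, (dx, dy, dz) in enumerate(offsets):
            in_idx = index.get((x + dx, y + dy, z + dz))
            if in_idx is not None:                      # neighbor voxel is occupied
                kernel_map.append((in_idx, out_idx, w))
    return kernel_map
```

The feature computation then gathers input rows by `input_idx`, multiplies each by the weight matrix selected by `weight_offset`, and scatter-adds into the row given by `output_idx`. Geometric continuity means most probed offsets near an occupied voxel hit other occupied voxels, which is the locality a property-aware engine can exploit.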