🤖 AI Summary
To address the lack of interaction between global structure and local context in point cloud compression, this paper proposes the point-voxel interlaced network (PVINet). PVINet runs a voxel-based encoder (Ev), which extracts global structural features, in parallel with a point-based encoder (Ep), which models the local context centered at each voxel, and lets the two interact at every scale. Its key component is a conditional sparse convolution: point embeddings dynamically customize the kernels used for voxel feature extraction, passing information from Ep to Ev. During decoding, a voxel-based decoder uses conditional sparse convolutions, with point embeddings as guidance, to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet is competitive with state-of-the-art methods.
📝 Abstract
In point cloud compression, the quality of a reconstructed point cloud depends on both global structure and local context. Existing methods usually process global and local information sequentially, with no communication between the two. In this paper, we propose a point-voxel interlaced network (PVINet), which captures global structural features and local contextual features in parallel and lets them interact at each scale to improve the efficiency of feature perception. Specifically, PVINet contains a voxel-based encoder (Ev) that extracts global structural features and a point-based encoder (Ep) that models the local context centered at each voxel. In particular, we introduce a novel conditional sparse convolution, which uses point embeddings to dynamically customize the kernels for voxel feature extraction, enabling feature interaction from Ep to Ev. During decoding, a voxel-based decoder employs conditional sparse convolutions that take point embeddings as guidance to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet delivers competitive performance compared to state-of-the-art methods.
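
The idea of a convolution whose kernel is customized by point embeddings can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how kernels are generated, so the sketch assumes a single linear layer (the hypothetical parameters `w_gen` and `b_gen`) that maps each voxel's point embedding to its own 3x3x3 kernel, and it assumes one point embedding per occupied voxel. The function name `conditional_sparse_conv` and all shapes are illustrative choices.

```python
import itertools

import numpy as np


def conditional_sparse_conv(coords, voxel_feats, point_emb, w_gen, b_gen):
    """Sparse 3x3x3 convolution with a per-voxel kernel generated from point embeddings.

    coords:      (N, 3) integer coordinates of occupied voxels
    voxel_feats: (N, C_in) voxel features
    point_emb:   (N, D) one point embedding per occupied voxel (assumption)
    w_gen:       (D, 27 * C_in * C_out) hypothetical kernel-generator weights
    b_gen:       (27 * C_in * C_out,) hypothetical kernel-generator bias
    returns:     (N, C_out) output features, defined only at occupied voxels
    """
    offsets = list(itertools.product((-1, 0, 1), repeat=3))  # 27 neighbor offsets
    n, c_in = voxel_feats.shape
    k = len(offsets)
    c_out = b_gen.size // (k * c_in)

    # Assumed kernel generator: a linear map from embedding to kernel weights.
    kernels = (point_emb @ w_gen + b_gen).reshape(n, k, c_in, c_out)

    # Hash map from voxel coordinate to row index, so empty space is skipped.
    index = {tuple(c): i for i, c in enumerate(coords.tolist())}

    out = np.zeros((n, c_out))
    for i, (x, y, z) in enumerate(coords.tolist()):
        for j, (dx, dy, dz) in enumerate(offsets):
            nb = index.get((x + dx, y + dy, z + dz))
            if nb is not None:  # only occupied neighbors contribute
                out[i] += voxel_feats[nb] @ kernels[i, j]
    return out


rng = np.random.default_rng(0)
coords = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [5, 5, 5]])
feats = rng.normal(size=(4, 2))            # C_in = 2
emb = rng.normal(size=(4, 3))              # D = 3
w_gen = rng.normal(size=(3, 27 * 2 * 5))   # C_out = 5
b_gen = np.zeros(27 * 2 * 5)
out = conditional_sparse_conv(coords, feats, emb, w_gen, b_gen)
print(out.shape)  # (4, 5)
```

In an ordinary sparse convolution the kernel is shared by all voxels; here each output voxel applies a kernel derived from its own point embedding, which is one plausible way for local context from Ep to shape feature extraction in Ev.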