PVINet: Point-Voxel Interlaced Network for Point Cloud Compression

📅 2025-08-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address insufficient interaction between global structure and local context in point cloud compression, this paper proposes the Point-Voxel Interlaced Network (PVINet). PVINet runs a voxel-based encoder, which captures global structure, in parallel with a point-based encoder, which models the local context around each voxel, and lets the two interact at every scale. It introduces a conditional sparse convolution in which point embeddings dynamically customize the kernels used for voxel feature extraction, passing information from the point encoder to the voxel encoder. At the decoder, conditional sparse convolutions use point embeddings as guidance to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet performs competitively with state-of-the-art methods.

📝 Abstract

In point cloud compression, the quality of a reconstructed point cloud relies on both the global structure and the local context, with existing methods usually processing global and local information sequentially and lacking communication between these two types of information. In this paper, we propose a point-voxel interlaced network (PVINet), which captures global structural features and local contextual features in parallel and performs interactions at each scale to enhance feature perception efficiency. Specifically, PVINet contains a voxel-based encoder (Ev) for extracting global structural features and a point-based encoder (Ep) that models local contexts centered at each voxel. Particularly, a novel conditional sparse convolution is introduced, which applies point embeddings to dynamically customize kernels for voxel feature extraction, facilitating feature interactions from Ep to Ev. During decoding, a voxel-based decoder employs conditional sparse convolutions to incorporate point embeddings as guidance to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet delivers competitive performance compared to state-of-the-art methods.
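The abstract does not spell out how point embeddings customize the voxel kernels, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes one point embedding per occupied voxel, a shared 3x3x3 kernel `w_base`, and a hypothetical projection `w_mod` that turns each embedding into per-tap gains; the function name, parameter names, and the sigmoid gating are all illustrative assumptions.

```python
import numpy as np

# 3x3x3 neighbourhood offsets (27 kernel taps); the kernel size is an assumption.
OFFSETS = [(dx, dy, dz)
           for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]


def conditional_sparse_conv(coords, feats, point_emb, w_base, w_mod):
    """Sparse convolution whose kernel is modulated per voxel by a point embedding.

    coords:    (N, 3) integer coordinates of occupied voxels
    feats:     (N, C_in) voxel features
    point_emb: (N, D) one point embedding per occupied voxel (assumed layout)
    w_base:    (27, C_in, C_out) shared kernel, one matrix per tap
    w_mod:     (D, 27) hypothetical projection from embedding to per-tap gains
    """
    # Hash map from voxel coordinate to row index, so only occupied voxels are visited.
    index = {tuple(c): i for i, c in enumerate(coords.tolist())}
    # Per-voxel, per-tap gains in (0, 2); a zero embedding gives gain 1 (plain conv).
    gains = 2.0 / (1.0 + np.exp(-(point_emb @ w_mod)))
    out = np.zeros((len(coords), w_base.shape[2]))
    for i, c in enumerate(coords.tolist()):
        for k, (dx, dy, dz) in enumerate(OFFSETS):
            j = index.get((c[0] + dx, c[1] + dy, c[2] + dz))
            if j is None:
                continue  # empty neighbours contribute nothing
            out[i] += gains[i, k] * (feats[j] @ w_base[k])
    return out


rng = np.random.default_rng(0)
coords = np.array([[0, 0, 0], [1, 0, 0], [5, 5, 5]])
out = conditional_sparse_conv(
    coords,
    rng.normal(size=(3, 4)),      # voxel features, C_in = 4
    rng.normal(size=(3, 8)),      # point embeddings, D = 8
    rng.normal(size=(27, 4, 2)),  # shared kernel, C_out = 2
    rng.normal(size=(8, 27)),     # embedding-to-gain projection
)
print(out.shape)  # (3, 2)
```

With an all-zero embedding every gain equals 1 and the function reduces to an ordinary sparse convolution, which makes the conditioning easy to isolate when experimenting. A practical version would rely on a sparse tensor library rather than a Python loop.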

Problem

Research questions and friction points this paper is trying to address.

Enhancing global and local feature interaction in point cloud compression

Improving reconstructed point cloud quality through parallel feature extraction

Addressing the lack of communication between global structural and local contextual information

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel global-local feature extraction

Conditional sparse convolution with kernels customized by point embeddings

Point-voxel feature interaction at every scale
