PVINet: Point-Voxel Interlaced Network for Point Cloud Compression

📅 2025-08-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address insufficient interaction between global structure and local contextual information in point cloud compression, this paper proposes the Point-Voxel Interlaced Network (PVINet). PVINet runs a voxel-based encoder and a point-based encoder in parallel: the voxel encoder captures global structure, while the point encoder models the local context centered at each voxel. It introduces a conditional sparse convolution in which point embeddings dynamically customize the voxel convolution kernels, so that features from the point branch inform the voxel branch at each scale. During decoding, a voxel-based decoder uses conditional sparse convolutions, with point embeddings as guidance, to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet delivers competitive performance compared with state-of-the-art methods.

📝 Abstract
In point cloud compression, the quality of a reconstructed point cloud relies on both the global structure and the local context, with existing methods usually processing global and local information sequentially and lacking communication between these two types of information. In this paper, we propose a point-voxel interlaced network (PVINet), which captures global structural features and local contextual features in parallel and performs interactions at each scale to enhance feature perception efficiency. Specifically, PVINet contains a voxel-based encoder (Ev) for extracting global structural features and a point-based encoder (Ep) that models local contexts centered at each voxel. Particularly, a novel conditional sparse convolution is introduced, which applies point embeddings to dynamically customize kernels for voxel feature extraction, facilitating feature interactions from Ep to Ev. During decoding, a voxel-based decoder employs conditional sparse convolutions to incorporate point embeddings as guidance to reconstruct the point cloud. Experiments on benchmark datasets show that PVINet delivers competitive performance compared to state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Enhancing global and local feature interaction in point cloud compression
Improving reconstructed point cloud quality through parallel feature extraction
Addressing lack of communication between structural and contextual information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel global-local feature extraction
Conditional sparse convolution customization
Point-voxel interlaced interaction enhancement
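
The conditional sparse convolution named above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract only states that point embeddings dynamically customize the kernels used for voxel feature extraction. The gating scheme here (a sigmoid gate per output channel), the function name `conditional_sparse_conv`, and the projection matrix `w_cond` are all assumptions chosen to keep the example short.

```python
import numpy as np

def conditional_sparse_conv(coords, feats, point_emb, base_kernel, w_cond):
    """Toy conditional sparse convolution over occupied voxels only.

    coords:      (N, 3) integer coordinates of occupied voxels
    feats:       (N, C_in) voxel features
    point_emb:   (N, D) point embedding of the local context at each voxel
    base_kernel: (27, C_in, C_out) shared weights for a 3x3x3 neighborhood
    w_cond:      (D, C_out) hypothetical projection from embedding to gate
    """
    # Hash map from voxel coordinate to row index (the "sparse" part).
    index = {tuple(c): i for i, c in enumerate(coords.tolist())}
    offsets = [(dx, dy, dz)
               for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]

    # Assumed conditioning: one sigmoid gate per output channel and voxel.
    gates = 1.0 / (1.0 + np.exp(-(point_emb @ w_cond)))  # (N, C_out)

    out = np.zeros((len(coords), base_kernel.shape[2]))
    for i, c in enumerate(coords.tolist()):
        # Kernel customized for voxel i by its point embedding.
        kernel_i = base_kernel * gates[i][None, None, :]
        for k, (dx, dy, dz) in enumerate(offsets):
            j = index.get((c[0] + dx, c[1] + dy, c[2] + dz))
            if j is not None:  # skip empty neighbors
                out[i] += feats[j] @ kernel_i[k]
    return out
```

With an all-zero embedding every gate equals 0.5, so the result is half of an ordinary sparse convolution with `base_kernel`. A non-zero embedding changes the effective kernel per voxel, which is the kind of point-to-voxel interaction the abstract describes.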
Xuan Deng
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China, and also with the Peng Cheng Laboratory, Shenzhen 519055, China
Xingtao Wang
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Xiandong Meng
University of California Davis
Natural Language Processing, LLM, Deep Learning
Xiaopeng Fan
Professor, Harbin Institute of Technology
Video/Image, Wireless
Debin Zhao
Dept. of Computer Science, Harbin Institute of Technology
Video coding, Image and Video Processing, Data Compression