🤖 AI Summary
To address two challenges in unordered point cloud geometry compression, the difficulty of modeling neighborhood relationships and the quantization distortion introduced by voxelization, this paper proposes a context-adaptive residual coding framework based on KNN neighborhood modeling. The method employs a two-tier encoder-decoder architecture: a non-learned base tier efficiently reconstructs the global structure, while an INR-driven learnable refinement tier models local geometric context to recover fine details and enables arbitrary-density upsampling at the decoder. Key contributions include: (i) the first incorporation of content-aware local geometric context into raw (non-voxelized) point cloud compression; (ii) a low-complexity two-tier architectural design; and (iii) the first integration of implicit neural representations (INRs) into a codec that jointly supports compression and multi-scale reconstruction. Experiments demonstrate that the method reduces encoding latency and model complexity by two orders of magnitude compared with SOTA methods, while maintaining superior rate-distortion performance and enabling flexible surface sampling.
📝 Abstract
Compressing a set of unordered points is far more challenging than compressing images or videos sampled on regular grids, because it is difficult to characterize neighborhood relations in an irregular layout of points. Many researchers resort to voxelization to introduce regularity, but this approach suffers from quantization loss. In this work, we use k-nearest-neighbor (KNN) search to determine the neighborhoods of raw surface points. This gives us a means to establish the spatial context in which the latent features of 3D points are compressed by arithmetic coding. The resulting conditional probability model adapts to local geometry, leading to significant rate reduction. Additionally, we propose a dual-layer architecture in which a non-learning base layer reconstructs the main structures of the point cloud at low complexity, while a learned refinement layer focuses on preserving fine details. This design reduces model complexity and coding latency by two orders of magnitude compared with SOTA methods. Moreover, we incorporate an implicit neural representation (INR) into the refinement layer, allowing the decoder to sample points on the underlying surface at arbitrary densities. This work is the first to effectively exploit content-aware local contexts for compressing irregular raw point clouds, simultaneously achieving high rate-distortion performance, low complexity, and the ability to function as an arbitrary-scale upsampling network.
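To make the core idea concrete, here is a minimal sketch of gathering KNN neighborhoods on raw (non-voxelized) points, the kind of local geometric context the abstract says conditions the entropy model. This is an illustration only, not the paper's implementation: the function name `knn_contexts`, the choice of `k = 8`, and the use of `scipy.spatial.cKDTree` are all assumptions for the sake of the example.

```python
# Hypothetical sketch: per-point KNN context gathering on raw 3D points.
# scipy's cKDTree stands in for whatever neighbor search the paper uses;
# k and all names here are illustrative, not the paper's actual choices.
import numpy as np
from scipy.spatial import cKDTree

def knn_contexts(points: np.ndarray, k: int = 8) -> np.ndarray:
    """For each point, return the offsets to its k nearest neighbors,
    a simple stand-in for the content-aware local context that could
    condition an arithmetic coder's probability model."""
    tree = cKDTree(points)
    # Query k+1 neighbors because the nearest neighbor of each point
    # is the point itself; we drop that self-match below.
    _, idx = tree.query(points, k=k + 1)
    neighbors = points[idx[:, 1:]]             # shape (N, k, 3)
    # Offsets are translation-invariant, so the context depends only
    # on local geometry, not on absolute position.
    offsets = neighbors - points[:, None, :]
    return offsets

rng = np.random.default_rng(0)
pts = rng.random((1000, 3)).astype(np.float32)
ctx = knn_contexts(pts, k=8)
print(ctx.shape)  # (1000, 8, 3)
```

Because each context is a fixed-size `(k, 3)` tensor regardless of how irregular the point layout is, it can be fed to a small network that predicts the parameters of the conditional probability model, without any voxelization step and hence without its quantization loss.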