🤖 AI Summary
To address the low efficiency of jointly compressing geometry and attributes in dynamic point cloud video (PCV), this paper proposes U-Motion, a learning-based compression scheme built around U-Inter, a U-shaped multi-scale motion estimation and compensation (ME/MC) framework. U-Inter couples top-down (fine-to-coarse) motion propagation with bottom-up motion predictive coding and multi-scale group motion compensation to explicitly model inter-frame motion at every scale, while a multi-scale spatial-temporal predictive coding module removes the cross-scale spatial redundancy that remains after inter prediction. By combining explicit multi-scale motion estimation, hierarchical feature pyramids, and entropy coding of prediction residuals, the scheme achieves end-to-end joint compression of geometry and attribute data. Evaluated under the MPEG Common Test Condition for dense dynamic point clouds, U-Motion achieves an average 18.7% BD-rate reduction compared to G-PCC-GesTM v3.0 and state-of-the-art learning-based methods, demonstrating significant gains in compression performance.
📝 Abstract
Point cloud video (PCV) is a versatile 3D representation of dynamic scenes with emerging applications. This paper introduces U-Motion, a learning-based compression scheme for both PCV geometry and attributes. We propose a U-Structured inter-frame prediction framework, U-Inter, which performs explicit motion estimation and compensation (ME/MC) at different scales with varying levels of detail. It integrates Top-Down (Fine-to-Coarse) Motion Propagation, Bottom-Up Motion Predictive Coding, and Multi-scale Group Motion Compensation to enable accurate motion estimation and efficient motion compression at each scale. In addition, we design a multi-scale spatial-temporal predictive coding module to capture the cross-scale spatial redundancy remaining after U-Inter prediction. We conduct experiments following the MPEG Common Test Condition for dense dynamic point clouds and demonstrate that U-Motion can achieve significant gains over MPEG G-PCC-GesTM v3.0 and recently published learning-based methods for both geometry and attribute compression.
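To make the coarse-to-fine ME/MC idea concrete, here is a minimal PyTorch sketch of a U-Inter-style loop: at each scale, the decoded motion from the coarser scale is upsampled to predict the current scale's motion (bottom-up motion predictive coding), only a residual refinement is computed on top of it, and the reference features are warped to produce a motion-compensated prediction. This is an illustrative reading, not the paper's implementation: it uses dense 3D feature grids instead of the sparse voxelized point clouds the paper operates on, it omits the top-down motion propagation pass and all entropy coding, and every module name, shape, and hyperparameter below is an assumption.

```python
# Hypothetical sketch of coarse-to-fine ME/MC in the spirit of U-Inter.
# Dense-grid simplification; the actual method works on sparse voxel tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp(ref, motion):
    """Trilinearly warp reference features `ref` (B,C,D,H,W) by a
    voxel-displacement field `motion` (B,3,D,H,W), channels in (x,y,z) order."""
    B, _, D, H, W = motion.shape
    zs, ys, xs = torch.meshgrid(
        torch.arange(D), torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack([xs, ys, zs], dim=-1).float().to(ref.device)  # (D,H,W,3)
    grid = base.unsqueeze(0) + motion.permute(0, 2, 3, 4, 1)
    # Normalize voxel coordinates to [-1, 1] as expected by grid_sample.
    scale = torch.tensor([W - 1, H - 1, D - 1], device=ref.device).clamp(min=1)
    grid = 2.0 * grid / scale - 1.0
    return F.grid_sample(ref, grid, mode="bilinear", align_corners=True)


class ScaleMotionCoder(nn.Module):
    """One scale of the motion coder: upsamples the coarser-scale motion as a
    prediction (bottom-up predictive coding) and refines it with a residual
    estimated from current/reference features. In a real codec only this
    residual would be entropy-coded."""
    def __init__(self, ch):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv3d(2 * ch + 3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 3, 3, padding=1))

    def forward(self, cur, ref, coarser_motion):
        if coarser_motion is None:  # coarsest scale: no prediction available
            up = torch.zeros(cur.shape[0], 3, *cur.shape[2:], device=cur.device)
        else:
            # Doubling both resolution and magnitude, as in flow pyramids.
            up = 2.0 * F.interpolate(coarser_motion, scale_factor=2,
                                     mode="trilinear", align_corners=False)
        residual = self.refine(torch.cat([cur, ref, up], dim=1))
        return up + residual  # decoded motion at this scale


def u_inter_sketch(cur_pyramid, ref_pyramid, coders):
    """Run ME/MC coarse-to-fine over feature pyramids (coarsest first).
    Returns per-scale motion-compensated predictions of the current frame."""
    motion, preds = None, []
    for cur, ref, coder in zip(cur_pyramid, ref_pyramid, coders):
        motion = coder(cur, ref, motion)   # decode motion at this scale
        preds.append(warp(ref, motion))    # per-scale motion compensation
    return preds


if __name__ == "__main__":
    ch = 8
    # Toy 2-level pyramids: coarse 4^3 grid, fine 8^3 grid.
    cur_pyr = [torch.randn(1, ch, 4, 4, 4), torch.randn(1, ch, 8, 8, 8)]
    ref_pyr = [torch.randn(1, ch, 4, 4, 4), torch.randn(1, ch, 8, 8, 8)]
    coders = nn.ModuleList([ScaleMotionCoder(ch) for _ in cur_pyr])
    preds = u_inter_sketch(cur_pyr, ref_pyr, coders)
    print([tuple(p.shape) for p in preds])
```

The key design point the sketch mirrors is that each scale codes only a motion *residual* relative to the upsampled coarser-scale motion, so the bitrate spent on motion shrinks as prediction across scales improves; the per-scale warped features would then feed the spatial-temporal predictive coding stage described in the abstract.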