ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision

📅 2024-10-28

🏛️ ACM Multimedia

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Point cloud geometry compression faces a trade-off between high compression ratios and the accuracy of downstream tasks. To address this, we propose the first ROI-guided dual-branch neural compression framework: a base branch encodes and decodes a simplified version of the point cloud, while a detail branch uses an ROI prediction network to generate spatial masks that guide mask-weighted residual encoding and rate-distortion (RD) optimization. The ROI masks are applied both to the residuals and to the per-point distortion term, and a joint loss combining RD loss and detection loss lets compression efficiency and object detection performance be optimized together. Evaluated on ScanNet and SUN RGB-D, our method achieves a 10% gain in detection accuracy over some learning-based compression methods at high bitrates, while maintaining strong compression performance.

📝 Abstract

Point cloud data is pivotal in applications like autonomous driving, virtual reality, and robotics. However, its substantial volume poses significant challenges for storage and transmission. At high compression ratios, crucial semantic details are often severely degraded, making it difficult to guarantee the accuracy of downstream tasks. To tackle this problem, we introduce the first Region of Interest (ROI)-guided Point Cloud Geometry Compression (RPCGC) method for human and machine vision. Our framework employs a dual-branch parallel structure, where the base layer encodes and decodes a simplified version of the point cloud, and the enhancement layer refines this by focusing on geometry details. Furthermore, the residual information of the enhancement layer is refined through an ROI prediction network. This network generates mask information, which is then incorporated into the residuals, serving as a strong supervision signal. Additionally, we apply the mask in the Rate-Distortion (RD) optimization process, weighting each point in the distortion calculation. Our loss function includes RD loss and detection loss to better guide point cloud encoding for machine vision. Experimental results demonstrate that RPCGC achieves exceptional compression performance and better detection accuracy (10% gain) than some learning-based compression methods at high bitrates on the ScanNet and SUN RGB-D datasets.
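
The abstract's idea of weighting each point in the distortion term and adding a detection loss can be illustrated with a small sketch. This is not the paper's implementation: the nearest-neighbour squared distance is a stand-in distortion measure, and the names `weighted_distortion`, `joint_loss`, `roi_weight`, `lam`, and `beta` are hypothetical, chosen only for illustration.

```python
def weighted_distortion(points, recon, mask, roi_weight=2.0):
    """Weighted mean of squared nearest-neighbour errors.

    points, recon: lists of (x, y, z) tuples.
    mask: list of 0/1 flags; 1 means the point lies inside an ROI.
    roi_weight: assumed multiplier for ROI points (not from the paper).
    """
    total, weight_sum = 0.0, 0.0
    for p, m in zip(points, mask):
        # Squared distance from the original point to its closest
        # reconstructed point (a simple stand-in distortion measure).
        d2 = min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in recon)
        w = roi_weight if m else 1.0
        total += w * d2
        weight_sum += w
    return total / weight_sum


def joint_loss(rate_bits, distortion, detection_loss, lam=1.0, beta=1.0):
    """Assumed form: rate + lam * distortion + beta * detection loss."""
    return rate_bits + lam * distortion + beta * detection_loss


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recon = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]  # second point is off by 1 in z

d_roi = weighted_distortion(points, recon, mask=[0, 1])    # error is inside the ROI
d_other = weighted_distortion(points, recon, mask=[1, 0])  # error is outside the ROI
loss = joint_loss(rate_bits=100.0, distortion=d_roi, detection_loss=0.2, lam=2.0)
```

With the same geometric error, the distortion is larger when the damaged point falls inside the ROI, so an optimizer minimizing this loss would spend more bits protecting ROI regions.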

Problem

Research questions and friction points this paper is trying to address.

Compress point cloud data efficiently for storage and transmission

Preserve crucial semantic details for downstream task accuracy

Enhance compression for both human and machine vision needs

Innovation

Methods, ideas, or system contributions that make the work stand out.

ROI-guided dual-branch compression for human-machine vision

ROI prediction network refines residual geometry details

Rate-Distortion optimization with weighted point masking
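
The dual-branch idea listed above can be sketched in a few lines. This is a toy illustration under strong assumptions, not the paper's neural architecture: the base layer is replaced by plain grid quantization, the ROI mask is given rather than predicted, and the function names `base_layer`, `enhancement_residuals`, and `reconstruct` are hypothetical.

```python
def base_layer(points, step=1.0):
    """Coarse version: snap every coordinate to a grid of spacing `step`."""
    return [tuple(round(c / step) * step for c in p) for p in points]


def enhancement_residuals(points, base, mask):
    """Per-point residuals, kept only where the ROI mask is 1."""
    return [
        tuple((a - b) * m for a, b in zip(p, q))
        for p, q, m in zip(points, base, mask)
    ]


def reconstruct(base, residuals):
    """Decoder side: add the masked residuals back onto the base layer."""
    return [
        tuple(b + r for b, r in zip(q, res))
        for q, res in zip(base, residuals)
    ]


points = [(0.2, 0.7, 1.1), (2.4, 2.6, 0.1)]
mask = [1, 0]  # assumed ROI flags: only the first point is inside an ROI

base = base_layer(points)
residuals = enhancement_residuals(points, base, mask)
decoded = reconstruct(base, residuals)
```

In this toy setup the ROI point is recovered at full precision, while the non-ROI point stays at base-layer precision, which mirrors the intent of spending detail bits only where the mask indicates they matter.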
