🤖 AI Summary
This work addresses the trade-off between privacy protection and data utility in 2D/3D vision tasks. We propose ROAR, a privacy scrubbing framework built on an “object erasure” paradigm: sensitive objects are localized via instance segmentation (e.g., Mask R-CNN) and *completely removed*, not blurred, using generative inpainting (diffusion models or VAEs), which preserves the structural integrity of the scene. ROAR applies the same scrubbing approach to both 2D object detection and 3D Neural Radiance Field (NeRF) reconstruction. On COCO, ROAR retains 87.5% of the baseline AP for 2D detection, compared with 74.2% for image dropping, a gap of 13.3 percentage points. For NeRF reconstruction, it incurs at most 1.66 dB of PSNR degradation while improving LPIPS and keeping SSIM near baseline levels. These results indicate that ROAR can provide strong privacy protection while maintaining high visual fidelity across heterogeneous vision tasks.
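The segment, erase, and re-annotate flow described above can be illustrated with a toy sketch. It is not the authors' implementation: the function names `erase_object` and `scrub`, the annotation dictionary layout, and the neighbour-averaging fill are all assumptions made for illustration, and the averaging fill is only a naive stand-in for a generative inpainting model. Masks are assumed to come from an upstream instance segmentation step.

```python
from statistics import mean


def erase_object(image, mask):
    """Fill masked pixels from already-known 4-neighbours, working inward.

    image: 2D list of grayscale values; mask: 2D list of bools (True = erase).
    Naive placeholder for generative inpainting (an assumption, not ROAR itself).
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input image is left untouched
    unknown = {(y, x) for y in range(h) for x in range(w) if mask[y][x]}
    while unknown:
        filled = {}
        for y, x in unknown:
            known = [
                out[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in unknown
            ]
            if known:
                filled[(y, x)] = mean(known)
        if not filled:  # mask covers the whole image; nothing to borrow from
            break
        for (y, x), value in filled.items():
            out[y][x] = value
        unknown.difference_update(filled)
    return out


def scrub(image, annotations, sensitive_labels):
    """Erase each sensitive instance and drop its annotation (re-annotation)."""
    kept = []
    for ann in annotations:
        if ann["label"] in sensitive_labels:
            image = erase_object(image, ann["mask"])
        else:
            kept.append(ann)
    return image, kept


# Hypothetical 3x3 scene: a "person" occupies the centre pixel.
image = [[10, 10, 10], [10, 99, 10], [10, 10, 10]]
person_mask = [[False, False, False], [False, True, False], [False, False, False]]
chair_mask = [[True, False, False], [False, False, False], [False, False, False]]
annotations = [
    {"label": "person", "mask": person_mask},
    {"label": "chair", "mask": chair_mask},
]
scrubbed, kept = scrub(image, annotations, {"person"})
```

The design point the sketch tries to capture is that the image stays in the dataset with its non-sensitive annotations intact, which is what distinguishes scrubbing from dropping the image outright.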
📝 Abstract
We introduce ROAR (Robust Object Removal and Re-annotation), a scalable framework for privacy-preserving dataset obfuscation that eliminates sensitive objects instead of modifying them. Our method integrates instance segmentation with generative inpainting to remove identifiable entities while preserving scene integrity. Extensive evaluations on 2D COCO-based object detection show that ROAR achieves 87.5% of the baseline detection average precision (AP), whereas image dropping achieves only 74.2% of the baseline AP, highlighting the advantage of scrubbing in preserving dataset utility. The degradation is even more severe for small objects due to occlusion and loss of fine-grained details. Furthermore, in NeRF-based 3D reconstruction, our method incurs a PSNR loss of at most 1.66 dB while maintaining SSIM and improving LPIPS, demonstrating superior perceptual quality. Our findings establish object removal as an effective privacy framework, achieving strong privacy guarantees with minimal performance trade-offs. The results highlight key challenges in generative inpainting, occlusion-robust segmentation, and task-specific scrubbing, setting the foundation for future advancements in privacy-preserving vision systems.
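To put the reported numbers in context, the short sketch below works through two pieces of arithmetic: the gap between the two "percent of baseline AP" figures, and the increase in mean squared error implied by a 1.66 dB PSNR loss. The helper names are hypothetical, and the PSNR conversion assumes the standard definition PSNR = 10 log10(MAX² / MSE) with the same peak value for both reconstructions.

```python
def percentage_point_gap(retained_a_pct, retained_b_pct):
    """Difference between two 'percent of baseline AP' figures, in points."""
    return round(retained_a_pct - retained_b_pct, 1)


def mse_ratio_for_psnr_drop(drop_db):
    """MSE inflation implied by a PSNR drop, assuming an unchanged peak value.

    PSNR = 10 * log10(MAX^2 / MSE), so a drop of d dB multiplies MSE by 10^(d/10).
    """
    return 10 ** (drop_db / 10)


gap = percentage_point_gap(87.5, 74.2)   # ROAR vs. image dropping
ratio = mse_ratio_for_psnr_drop(1.66)    # worst-case NeRF PSNR loss reported
print(f"AP retention gap: {gap} points; MSE ratio at 1.66 dB: {ratio:.2f}x")
```

Under these assumptions, the worst-case PSNR loss corresponds to a mean squared error roughly 1.47 times that of the unscrubbed baseline.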