🤖 AI Summary
Even when GPS metadata is removed, video frames can still be geolocated through background visual cues, which threatens location privacy. This work proposes PPEDCRF, a selective perturbation mechanism guided by a dynamic conditional random field (DCRF): it identifies location-sensitive background regions, rescales perturbation strength with a normalized control penalty (NCP), and injects Gaussian noise only into those regions. At σ₀=8, the method reduces the Top-1 retrieval accuracy of a ResNet18 attacker from 0.667 to 0.361±0.127 while preserving 36.14 dB PSNR, about 6 dB higher than global Gaussian noise at the same noise scale. The approach thus mitigates gallery-based retrieval attacks at a lower cost in visual quality than global noise injection; at matched utility, the two reach similar Top-1 privacy.
📝 Abstract
We propose PPEDCRF, a calibrated selective perturbation framework that protects \emph{background-based location privacy} in released video frames against gallery-based retrieval attackers. Even after GPS metadata are stripped, an adversary can geolocate a frame by matching its background visual cues to geo-tagged reference imagery; PPEDCRF mitigates this threat by estimating location-sensitive background regions with a dynamic conditional random field (DCRF), rescaling perturbation strength with a normalized control penalty (NCP), and injecting Gaussian noise only inside the inferred regions via a DP-style calibration rule.
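The selective injection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-pixel `sensitivity` map stands in for the DCRF output, the scalar `strength` stands in for the NCP rescaling, and the function name and signature are hypothetical. Neither the DCRF inference nor the DP-style calibration rule is reproduced here.

```python
import random


def selective_gaussian_perturb(frame, sensitivity, sigma0, strength=1.0, seed=0):
    """Add Gaussian noise only where the sensitivity map is non-zero.

    frame       -- 2D list of pixel intensities in [0, 255]
    sensitivity -- 2D list of weights in [0, 1]; a hypothetical stand-in
                   for the DCRF-inferred location-sensitive regions
    sigma0      -- base noise standard deviation
    strength    -- hypothetical scalar standing in for the NCP rescaling
    """
    rng = random.Random(seed)
    released = []
    for pixel_row, weight_row in zip(frame, sensitivity):
        new_row = []
        for pixel, weight in zip(pixel_row, weight_row):
            std = sigma0 * strength * weight
            # Pixels outside the sensitive region (weight 0) pass through unchanged.
            noisy = pixel + rng.gauss(0.0, std) if std > 0 else pixel
            # Clip back to the valid intensity range.
            new_row.append(min(255.0, max(0.0, noisy)))
        released.append(new_row)
    return released


# Toy 4x4 grey frame whose bottom-right 2x2 block is marked as sensitive.
frame = [[128.0] * 4 for _ in range(4)]
mask = [[1.0 if (y >= 2 and x >= 2) else 0.0 for x in range(4)] for y in range(4)]
released = selective_gaussian_perturb(frame, mask, sigma0=8.0)
```

In this sketch only the four masked pixels change, while the remaining twelve are released untouched; that spatial concentration is the property the abstract relies on.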
On a controlled paired-scene retrieval benchmark with eight attacker backbones and three noise seeds, PPEDCRF reduces ResNet18 Top-1 retrieval accuracy from 0.667 to $0.361\pm0.127$ at $\sigma_0=8$ while preserving $36.14\,$dB PSNR -- an ${\approx}6\,$dB quality advantage over global Gaussian noise. Transfer across the eight-backbone seed-averaged benchmark is broadly supportive (23 of 24 backbone-gallery cells show negative $\Delta$), while appendix-scale confirmation identifies MixVPR as a remaining adverse-transfer exception. Matched-operating-point analysis shows that PPEDCRF and global Gaussian noise converge in Top-1 privacy at equal utility, so the practical benefit is spatially concentrated perturbation that preserves higher visual quality at any given noise scale rather than stronger matched-utility privacy. Code: https://github.com/mabo1215/PPEDCRF
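To see why spatially concentrated noise keeps PSNR higher at a fixed noise scale, the sketch below compares global and masked Gaussian noise on a synthetic frame. Everything here is an illustrative assumption: the uniform grey frame, the mask covering one quarter of the pixels, and the helper names `psnr` and `add_noise` are not taken from the paper, whose actual mask coverage is not stated in the abstract. When noise touches only a fraction $f$ of the pixels, the mean squared error scales by roughly $f$, so the expected PSNR gain is about $10\log_{10}(1/f)$ dB.

```python
import math
import random


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D frames."""
    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        for ref, dist in zip(ref_row, dist_row):
            squared_error += (ref - dist) ** 2
            count += 1
    mse = squared_error / count
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def add_noise(frame, mask, sigma, seed):
    """Add N(0, sigma^2) noise where mask is 1; no clipping (mid-grey input)."""
    rng = random.Random(seed)
    return [
        [p + rng.gauss(0.0, sigma) if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


size = 64
sigma = 8.0
frame = [[128.0] * size for _ in range(size)]
full_mask = [[1] * size for _ in range(size)]
# Hypothetical sensitive region: the top-left quarter of the frame.
quarter_mask = [
    [1 if (y < size // 2 and x < size // 2) else 0 for x in range(size)]
    for y in range(size)
]

psnr_global = psnr(frame, add_noise(frame, full_mask, sigma, seed=0))
psnr_masked = psnr(frame, add_noise(frame, quarter_mask, sigma, seed=0))
gap = psnr_masked - psnr_global
print(f"global: {psnr_global:.2f} dB, masked: {psnr_masked:.2f} dB, gap: {gap:.2f} dB")
```

With a quarter-coverage mask the gap should land near $10\log_{10}4 \approx 6$ dB, up to sampling noise; other coverage fractions shift the gap accordingly.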