🤖 AI Summary
This work addresses the underexplored risk of identity leakage when frozen vision encoders are applied to face-containing data, and the absence of practical privacy-preserving mechanisms. We introduce the first privacy auditing framework tailored to non-face-recognition encoders, featuring an adversary-calibrated evaluation protocol, and propose a single-step linear projection, the identity sanitization projection (ISP), that removes biometric information by projecting out an estimated identity subspace. Experiments show that CLIP exhibits higher identity leakage than DINOv2/v3 and SSCD. Applying ISP reduces linear-probe identity recognition to near-chance levels while preserving high utility for non-biometric semantic tasks, and it transfers across CelebA-20 and VGGFace2 with minor degradation.
📝 Abstract
Frozen visual embeddings (e.g., CLIP, DINOv2/v3, SSCD) power retrieval and integrity systems, yet their use on face-containing data is constrained by unmeasured identity leakage and a lack of deployable mitigations. We take an attacker-aware view and make two contributions: (i) a benchmark of visual embeddings that reports open-set verification at low false-accept rates, a calibrated diffusion-based template inversion check, and face-context attribution with equal-area perturbations; and (ii) a one-shot linear projector, which we call the identity sanitization projection (ISP), that removes an estimated identity subspace while preserving the complementary space needed for utility. Across CelebA-20 and VGGFace2, we show that these encoders leak identity under open-set linear probes, with CLIP exhibiting relatively higher leakage than DINOv2/v3 and SSCD; that they are robust to template inversion; and that they are context-dominant. In addition, we show that ISP drives linear identity access to near chance while retaining high non-biometric utility, and that it transfers across datasets with minor degradation. Our results establish the first attacker-calibrated facial privacy audit of non-FR encoders and demonstrate that linear subspace removal can strongly suppress linearly accessible identity information while preserving utility for visual search and retrieval.
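To make the idea of removing an estimated identity subspace concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the subspace is estimated, so this sketch assumes it is spanned by the top `k` principal directions of the per-identity mean embeddings. The names `fit_identity_projector`, `sanitize`, and `k`, as well as the re-normalization step, are hypothetical choices made for illustration.

```python
import numpy as np


def fit_identity_projector(embeddings, identity_labels, k):
    """Build a (d, d) orthogonal projector that removes k identity directions.

    Assumption (not from the paper): the identity subspace is estimated
    from the centered per-identity mean embeddings.
    """
    X = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(identity_labels)
    global_mean = X.mean(axis=0)

    # One mean vector per identity, centered on the global mean.
    class_means = np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])
    centered = class_means - global_mean

    # Rows of vt are orthonormal directions, sorted by how strongly
    # the identity means vary along them.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    U = vt[:k].T  # shape (d, k), orthonormal columns

    # P = I - U U^T keeps only the orthogonal complement of span(U).
    return np.eye(X.shape[1]) - U @ U.T


def sanitize(embeddings, projector):
    """Apply the projector, then re-normalize rows to unit length.

    Re-normalization is an assumption, convenient for cosine-similarity
    retrieval; it is not stated in the source.
    """
    Z = np.asarray(embeddings, dtype=np.float64) @ projector
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.clip(norms, 1e-12, None)
```

Because the projector is a single fixed matrix, it can be fitted once on labeled face data and then applied to any embedding from the same encoder with one matrix multiplication. How many directions to remove (`k`) trades privacy against utility and would need to be chosen empirically.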