Robust Privacy: Inference-Time Privacy through Certified Robustness

📅 2026-01-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a Robust Privacy (RP) framework that integrates certified robustness with privacy protection during model inference, addressing the risk that machine learning models may inadvertently leak sensitive input attributes through their outputs. By enforcing prediction invariance within ℓ₂-norm neighborhoods of inputs, the framework ensures local stability, and an Attribute Privacy Enhancement (APE) mechanism translates this input-level invariance into provable attribute-level privacy guarantees. Experimental results on recommendation tasks show that RP substantially expands the interval of sensitive-attribute values compatible with a given prediction. Under Gaussian noise with σ=0.1, the success rate of model inversion attacks drops from 73% to 4% at the cost of some model performance; with no performance degradation at all, the attack success rate is still reduced to 44%.

📝 Abstract
Machine learning systems can produce personalized outputs that allow an adversary to infer sensitive input attributes at inference time. We introduce Robust Privacy (RP), an inference-time privacy notion inspired by certified robustness: if a model's prediction is provably invariant within a radius-$R$ neighborhood around an input $x$ (e.g., under the $\ell_2$ norm), then $x$ enjoys $R$-Robust Privacy, i.e., observing the prediction cannot distinguish $x$ from any input within distance $R$ of $x$. We further develop Attribute Privacy Enhancement (APE) to translate input-level invariance into an attribute-level privacy effect. In a controlled recommendation task where the decision depends primarily on a sensitive attribute, we show that RP expands the set of sensitive-attribute values compatible with a positive recommendation, thereby widening the inference interval. Finally, we empirically demonstrate that RP also mitigates model inversion attacks (MIAs) by masking fine-grained input-output dependence. Even at small noise levels ($\sigma=0.1$), RP reduces the attack success rate (ASR) from 73% to 4% with partial model performance degradation. RP can also partially mitigate MIAs (e.g., ASR drops to 44%) with no model performance degradation.
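The abstract defines $R$-Robust Privacy via prediction invariance within an $\ell_2$ ball, which mirrors certified robustness via randomized smoothing (Cohen et al.). The paper does not specify its certification procedure, so the following is only a minimal sketch under that assumption: a toy base classifier (`base_classifier` is a hypothetical stand-in, not the paper's model) is smoothed with Gaussian noise, and a certified radius $R = \sigma \, \Phi^{-1}(p)$ is estimated from the empirical agreement probability $p$. Any input within distance $R$ of $x$ would receive the same smoothed prediction, which is the privacy guarantee RP appeals to.

```python
import numpy as np
from statistics import NormalDist


def base_classifier(x):
    # Toy linear decision rule standing in for the model f; purely illustrative.
    return int(x.sum() > 0)


def certified_radius(x, sigma=0.1, n_samples=1000, seed=0):
    """Estimate an l2 certification radius via randomized smoothing.

    R = sigma * Phi^{-1}(p), where p is the empirical probability that
    the Gaussian-noised prediction matches the majority class. Within
    radius R of x, the smoothed prediction is provably unchanged, so an
    observer of the output cannot distinguish x from any such neighbor.
    """
    rng = np.random.default_rng(seed)
    preds = [base_classifier(x + rng.normal(0.0, sigma, size=x.shape))
             for _ in range(n_samples)]
    p = max(np.mean(preds), 1.0 - np.mean(preds))
    p = min(p, 1.0 - 1e-6)  # clip to avoid an infinite radius when p == 1
    return sigma * NormalDist().inv_cdf(p)


x = np.array([0.5, 0.3, -0.2])
R = certified_radius(x, sigma=0.1)
print(f"certified l2 radius: {R:.3f}")
```

In RP's terms, this `R` is the radius within which the input enjoys $R$-Robust Privacy; larger `sigma` enlarges the radius (and the compatibility interval for a sensitive attribute) at the cost of model utility, matching the 4%-vs-44% ASR trade-off reported above.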
Problem

Research questions and friction points this paper is trying to address.

inference-time privacy
sensitive attribute inference
model inversion attacks
privacy leakage
adversarial inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Robust Privacy
Certified Robustness
Inference-Time Privacy
Model Inversion Attacks
Attribute Privacy Enhancement
Jiankai Jin
Xiangzheng Zhang
360
AI safety · Large language models · Information Retrieval
Zhao Liu
Deyue Zhang
Quanchen Zou
AI Security Lab