🤖 AI Summary
Machine learning services often inadvertently leak users' private attributes (e.g., gender, race) during data collection, posing serious privacy risks. Existing adversarial training–based approaches for private attribute protection are inherently fragile and struggle to simultaneously ensure privacy preservation and downstream task utility. To address this, we propose an information-theoretic, differentiable random sample replacement paradigm that explicitly models the mutual information between private attributes and learned features. Our method introduces a stochastic, differentiable replacement mechanism coupled with a customized loss function to achieve strict statistical decoupling of private attributes from representations. Crucially, it avoids the instability and optimization challenges inherent in adversarial training and generalizes across multiple modalities, including images, sensor signals, and speech. Experiments demonstrate that our approach reduces private attribute prediction accuracy by over 60% while incurring less than 2% degradation in downstream task performance, substantially outperforming state-of-the-art methods.
📝 Abstract
The growth of Machine Learning (ML) services requires extensive collection of user data, which may inadvertently include private information irrelevant to the services. Various methods have been proposed to protect private attributes by removing them from the data while maintaining the data's utility for downstream tasks. Nevertheless, as we show theoretically and empirically in this paper, these methods exhibit severe vulnerability because of a common weakness rooted in their adversarial training–based strategies. To overcome this limitation, we propose a novel approach, PASS, designed to stochastically substitute the original sample with another one according to certain probabilities. PASS is trained with a novel loss function soundly derived from an information-theoretic objective defined for utility-preserving private attribute protection. A comprehensive evaluation of PASS on datasets of different modalities, including facial images, human activity sensory signals, and voice recordings, substantiates PASS's effectiveness and generalizability.
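To make the "stochastic substitution according to certain probabilities" concrete, one common way to make such a discrete replacement step differentiable is a Gumbel-softmax relaxation over candidate samples. The NumPy sketch below illustrates that idea only; the function names, the candidate-pool setup, and the choice of Gumbel-softmax are our assumptions for illustration, not the paper's actual PASS mechanism.

```python
import numpy as np

def gumbel_softmax_weights(logits, tau, rng):
    """Sample relaxed one-hot selection weights via the Gumbel-softmax trick."""
    # Gumbel(0, 1) noise makes the argmax of (logits + g) a categorical sample.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    z = (logits + g) / tau
    z -= z.max()  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def stochastic_replace(candidates, logits, tau, rng):
    """Replace a sample with a convex combination of candidate samples.

    As tau -> 0 the weights approach a hard one-hot choice (true random
    sample replacement); at larger tau the substitution stays soft, so in
    an autodiff framework gradients could flow back into `logits`.
    """
    w = gumbel_softmax_weights(logits, tau, rng)
    return w @ candidates, w

# Hypothetical usage: 4 candidate replacement samples of dimension 3.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(4, 3))
logits = np.array([0.1, 2.5, -1.0, 0.3])  # illustrative learned replacement scores
x_new, w = stochastic_replace(candidates, logits, tau=0.1, rng=rng)
```

In an actual training pipeline the `logits` would be produced by a learned network and optimized jointly with the utility and privacy terms of the loss, which is where the differentiability of the relaxation matters.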