🤖 AI Summary
In vertical federated learning (VFL), sharing raw confidence scores exposes participants to feature inference attacks. To address this, we propose PRIVEE, a privacy-preserving confidence-obfuscation mechanism that preserves both the ordering of scores and their inter-score distances. PRIVEE is the first method to jointly guarantee relative ranking consistency, inter-class distance discernibility, and strong privacy in VFL. It employs a differentiable order-preserving mapping for score transformation, models distance-sensitive perturbations, and unifies both in a single privacy-utility optimization framework. Across multiple benchmark datasets, PRIVEE incurs zero accuracy loss while reducing the success rate of feature reconstruction attacks to near-random levels (≤12.5%). Its privacy protection surpasses state-of-the-art methods by a factor of three, breaking the "usable but invisible" security bottleneck inherent in existing VFL protocols.
📝 Abstract
Vertical Federated Learning (VFL) enables collaborative model training across organizations that share common user samples but hold disjoint feature spaces. Despite its potential, VFL is susceptible to feature inference attacks, in which adversarial parties exploit shared confidence scores (i.e., prediction probabilities) during inference to reconstruct the private input features of other participants. To counter this threat, we propose PRIVEE (PRIvacy-preserving Vertical fEderated lEarning), a novel defense mechanism named after the French word privée, meaning "private." PRIVEE obfuscates confidence scores while preserving critical properties such as relative ranking and inter-score distances. Rather than exposing raw scores, PRIVEE shares only the transformed representations, mitigating the risk of reconstruction attacks without degrading model prediction accuracy. Extensive experiments show that PRIVEE achieves a threefold improvement in privacy protection against advanced feature inference attacks compared to state-of-the-art defenses, while fully preserving predictive performance.
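The abstract does not spell out PRIVEE's transformation, but the two properties it preserves can be illustrated with a minimal sketch: any strictly increasing affine map keeps the ranking of confidence scores intact and rescales all inter-score gaps by the same factor, so relative distances survive while the raw values are hidden. The function name and parameter ranges below are illustrative assumptions, not PRIVEE's actual mechanism.

```python
import numpy as np

def obfuscate_scores(scores, rng):
    """Illustrative (not PRIVEE's) obfuscation: a fresh strictly
    increasing affine map per query. Ranking is preserved because
    a > 0; inter-score distances are all scaled by the same factor
    a, so their ratios are preserved too."""
    a = rng.uniform(0.5, 2.0)   # positive scale, drawn per query (assumed range)
    b = rng.uniform(-1.0, 1.0)  # shift, drawn per query (assumed range)
    return a * np.asarray(scores, dtype=float) + b

rng = np.random.default_rng(0)
raw = np.array([0.70, 0.20, 0.10])   # raw confidence scores
obf = obfuscate_scores(raw, rng)

# Ranking is preserved ...
assert np.array_equal(np.argsort(obf), np.argsort(raw))
# ... and so are the ratios of inter-score distances.
assert np.isclose((obf[0] - obf[1]) / (obf[1] - obf[2]),
                  (raw[0] - raw[1]) / (raw[1] - raw[2]))
```

Note that this toy map preserves distance *ratios* rather than absolute distances; PRIVEE's distance-sensitive perturbation modeling presumably offers a more careful trade-off between discernibility and privacy.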