LADSG: Label-Anonymized Distillation and Similar Gradient Substitution for Label Privacy in Vertical Federated Learning

📅 2025-06-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
In vertical federated learning (VFL), internal passive label inference attacks can reconstruct private labels by exploiting gradients and semantic embeddings, even with only a few auxiliary labels, causing large-scale privacy leakage. Existing defenses typically target isolated leakage channels and thus fail against multi-source, composite attacks. To address this, we propose LADSG, a unified defense framework that jointly mitigates gradient-, embedding-, and label-level leakage without encryption or strong assumptions. LADSG integrates three novel components: (i) gradient semantic similarity substitution via similar-subspace projection, (ii) label distillation-based anonymization, and (iii) lightweight anomaly detection with privacy-utility trade-off optimization. Evaluated on six real-world datasets, LADSG reduces attack success rates by 30–60% while increasing computational overhead by less than 3%, significantly enhancing both robustness and practicality of VFL systems.
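The gradient-substitution component can be pictured as a nearest-neighbor swap in gradient space (a minimal hypothetical sketch, not the paper's exact similar-subspace projection; the function name, the cosine-similarity matching rule, and the label-agnostic gradient pool are all assumptions for illustration):

```python
import numpy as np

def substitute_gradients(true_grads, pool):
    """For each true per-sample gradient, send the most cosine-similar
    gradient from a label-agnostic pool instead of the real one, so the
    passive party never observes label-revealing gradients directly.
    (Hypothetical stand-in for LADSG's similar-subspace substitution.)"""
    tn = true_grads / np.linalg.norm(true_grads, axis=1, keepdims=True)
    pn = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    sims = tn @ pn.T            # (batch, pool) cosine-similarity matrix
    idx = sims.argmax(axis=1)   # index of the closest pool gradient per sample
    return pool[idx]

# Toy usage: 4 true gradients, a pool of 16 decoys, dimension 8.
rng = np.random.default_rng(0)
grads = rng.normal(size=(4, 8))
pool = rng.normal(size=(16, 8))
out = substitute_gradients(grads, pool)
```

Every returned row is an exact member of the pool, so the substituted gradients stay close in direction to the true ones (preserving training signal) while decoupling what the passive party sees from the private labels.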

📝 Abstract
Vertical federated learning (VFL) has become a key paradigm for collaborative machine learning, enabling multiple parties to train models over distributed feature spaces while preserving data privacy. Although security protocols such as gradient masking and encryption defend against external attacks by preventing unauthorized access to sensitive data, label inference attacks mounted from within the system have recently emerged. These attacks exploit gradients and semantic embeddings to reconstruct private labels, bypassing traditional defenses. For example, the passive label inference attack can reconstruct tens of thousands of participants' private data using just 40 auxiliary labels, posing a significant security threat. Existing defenses address single leakage pathways, such as gradient leakage or label exposure. As attack strategies evolve, their limitations become clear, especially against hybrid attacks that combine multiple vectors. To address this, we propose Label-Anonymized Defense with Substitution Gradient (LADSG), a unified defense framework that integrates gradient substitution, label anonymization, and anomaly detection. LADSG mitigates both gradient and label leakage while maintaining the scalability and efficiency of VFL. Experiments on six real-world datasets show that LADSG reduces label inference attack success rates by 30–60% with minimal computational overhead, underscoring the importance of lightweight defenses in securing VFL.
Problem

Research questions and friction points this paper is trying to address.

Preventing label inference attacks in vertical federated learning
Addressing gradient and label leakage simultaneously
Maintaining VFL scalability while enhancing label privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Label anonymization to protect private data
Gradient substitution to prevent leakage
Anomaly detection for enhanced security
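The label-anonymization idea above can be illustrated with a standard knowledge-distillation softening step (a minimal sketch under assumed details; `soften_labels` and the temperature value are illustrative, not the paper's exact procedure):

```python
import numpy as np

def soften_labels(teacher_logits, temperature=4.0):
    """Distillation-style label anonymization (hypothetical sketch):
    replace hard one-hot labels with temperature-softened teacher
    probabilities, so the exact private label is never shared directly."""
    z = teacher_logits / temperature
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# Example: a confident teacher prediction becomes a soft target.
logits = np.array([[8.0, 1.0, 0.5]])
soft = soften_labels(logits)
```

Raising the temperature flattens the distribution: downstream parties train against soft targets that still carry class-similarity signal while revealing less about the true label than a one-hot vector would.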
Zeyu Yan
PhD, Computer Science, University of Maryland, College Park
Human Computer Interaction · Digital Fabrication · Sustainability · Interaction systems · Accessibility
Yifei Yao
School of Cyber Engineering, Xidian University, Xi’an 710126, China
Xuanbing Wen
School of Cyber Engineering, Xidian University, Xi’an 710126, China
Juli Zhang
School of Computer Science and Technology, Xidian University, Xi’an 710126, Shaanxi, China
Kai Fan
ByteDance
Machine learning · Bayesian Deep Learning · Machine translation · LLMs