🤖 AI Summary
In vertical federated learning (VFL), internal passive label inference attacks can reconstruct private labels by exploiting gradients and semantic embeddings—even with only a few auxiliary labels—causing large-scale privacy leakage. Existing defenses typically target isolated leakage channels and thus fail against multi-source, composite attacks. To address this, we propose LADSG, a unified defense framework that jointly mitigates gradient-, embedding-, and label-level leakage without encryption or strong assumptions. LADSG integrates three novel components: (i) gradient semantic similarity substitution via similar-subspace projection, (ii) label distillation-based anonymization, and (iii) lightweight anomaly detection with privacy-utility trade-off optimization. Evaluated on six real-world datasets, LADSG reduces attack success rates by 30–60% while increasing computational overhead by less than 3%, significantly enhancing both robustness and practicality of VFL systems.
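The gradient-substitution idea in component (i) can be sketched in a deliberately simplified, hypothetical form: each sample's true gradient is replaced with the gradient of a semantically similar sample, chosen among its nearest neighbors in embedding space. The function name, the cosine-similarity neighbor search, and the random pick below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def substitute_gradients(grads, embeds, k=5):
    """Replace each sample's gradient with the gradient of one of its k
    most similar samples in embedding space (hypothetical simplification
    of LADSG-style gradient semantic similarity substitution)."""
    # cosine similarity between all pairs of embeddings
    normed = embeds / np.linalg.norm(embeds, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # never substitute a sample with itself
    out = np.empty_like(grads)
    for i in range(len(grads)):
        neighbors = np.argsort(sim[i])[-k:]    # k most similar samples
        out[i] = grads[rng.choice(neighbors)]  # substituted gradient
    return out

# toy batch: 8 samples, 4-dim gradients, 16-dim embeddings
grads = rng.normal(size=(8, 4))
embeds = rng.normal(size=(8, 16))
subbed = substitute_gradients(grads, embeds)
print(subbed.shape)  # (8, 4)
```

Because every transmitted gradient now belongs to a *different* but semantically similar sample, a passive party matching gradients to labels is misled while the aggregate training signal stays close to the original.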
📝 Abstract
Vertical federated learning (VFL) has become a key paradigm for collaborative machine learning, enabling multiple parties to train models over distributed feature spaces while preserving data privacy. Although security protocols such as gradient masking and encryption defend against external attacks by preventing unauthorized access to sensitive data, label inference attacks mounted from within the system have recently emerged. These attacks exploit gradients and semantic embeddings to reconstruct private labels, bypassing traditional defenses. For example, the passive label inference attack can reconstruct tens of thousands of participants' private labels using just 40 auxiliary labels, posing a significant security threat. Existing defenses each address a single leakage pathway, such as gradient leakage or label exposure, and their limitations become clear as attack strategies evolve, especially against hybrid attacks that combine multiple vectors. To address this, we propose Label-Anonymized Defense with Substitution Gradient (LADSG), a unified defense framework that integrates gradient substitution, label anonymization, and anomaly detection. LADSG mitigates both gradient and label leakage while preserving the scalability and efficiency of VFL. Experiments on six real-world datasets show that LADSG reduces label inference attack success rates by 30–60% with minimal computational overhead, underscoring the value of lightweight defenses in securing VFL.
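The label-anonymization component can likewise be sketched with a standard knowledge-distillation device: instead of sharing hard one-hot labels, the active party shares temperature-softened teacher outputs. The function name, temperature value, and toy logits below are illustrative assumptions, not LADSG's exact anonymization rule.

```python
import numpy as np

def anonymize_labels(teacher_logits, temperature=4.0):
    """Turn teacher logits into softened probability vectors
    (distillation-style soft labels) so the ground-truth one-hot
    labels are never exposed. Hypothetical illustration."""
    z = teacher_logits / temperature
    z -= z.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# toy 3-class logits for two samples
teacher_logits = np.array([[4.0, 1.0, 0.5],
                           [0.2, 3.5, 1.0]])
soft = anonymize_labels(teacher_logits)
print(soft.round(3))
```

The soft labels keep the argmax (so the bottom model still learns the right class ranking) but flatten the distribution, making exact label recovery from gradients harder.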