LEA: Label Enumeration Attack in Vertical Federated Learning

📅 2026-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of label privacy in vertical federated learning (VFL), where existing label inference attacks often rely on scenario-specific assumptions or auxiliary data, limiting their practicality. The authors propose a generic label enumeration attack that requires no auxiliary information and is applicable across diverse VFL settings. By clustering enumerated sample-to-label mappings and evaluating the cosine similarity between the loss gradients of simulated and benign models in the first training round, the method accurately recovers private labels. To enhance efficiency, they further introduce the Binary-LEA algorithm, which reduces the enumeration complexity from $n!$ to $n^3$. Experimental results demonstrate that the attack achieves high accuracy while exhibiting strong robustness against common defenses such as gradient noise injection and compression.

📝 Abstract
A typical Vertical Federated Learning (VFL) scenario involves several participants collaboratively training a machine learning model, where each party holds different features for the same samples and the labels are held exclusively by one party. Since labels contain sensitive information, VFL must ensure their privacy. However, existing VFL-targeted label inference attacks are either limited to specific scenarios or require auxiliary data, rendering them impractical in real-world applications. We introduce a novel Label Enumeration Attack (LEA) that, for the first time, is applicable across multiple VFL scenarios and requires no auxiliary data. Our intuition is that an adversary can use clustering to enumerate mappings between samples and labels, then identify the correct mapping by evaluating the similarity between the benign model and the simulated models trained under each candidate mapping. The first challenge is how to measure model similarity, since models trained on the same data can have different weights. Drawing on our findings, we propose an efficient congruence test based on the cosine similarity of the first-round loss gradients, which offers superior efficiency and precision compared to comparing parameter similarities. However, the computational cost may still be prohibitive, because the enumeration generates a vast number of simulated models that must be trained and compared. To overcome this, we propose Binary-LEA, which reduces the number of models and eliminates futile training, lowering the number of enumerations from n! to n^3. Moreover, LEA is resilient against common defense mechanisms such as gradient noise and gradient compression.
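The congruence test at the heart of LEA — scoring each candidate sample-to-label mapping by the cosine similarity between the first-round loss gradients of its simulated model and those of the benign model — can be sketched as below. The gradient values and mapping names are hypothetical placeholders, not figures from the paper:

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two flattened gradient vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy first-round loss gradients (illustrative numbers only).
benign_grad = [0.8, -0.2, 0.5, 0.1]
candidate_grads = {
    "mapping_A": [0.79, -0.21, 0.52, 0.09],  # near-parallel to benign gradient
    "mapping_B": [-0.3, 0.6, -0.1, 0.4],     # dissimilar direction
}

# The candidate whose simulated gradient best aligns with the benign
# gradient is taken as the recovered label mapping.
best = max(candidate_grads,
           key=lambda m: cosine_similarity(benign_grad, candidate_grads[m]))
print(best)  # prints mapping_A
```

In the full attack the candidates come from enumerating clustered mappings, and Binary-LEA prunes this enumeration from n! to n^3 simulated models.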
Problem

Research questions and friction points this paper is trying to address.

Vertical Federated Learning
Label Privacy
Label Inference Attack
Privacy Attack
Federated Learning Security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Label Enumeration Attack
Vertical Federated Learning
Model Similarity
Gradient Cosine Similarity
Privacy Attack
Wenhao Jiang
GML, Tencent, PolyU
Computer Vision · Machine Learning · Foundation Models
Shaojing Fu
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Yuchuan Luo
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Lin Liu
College of Computer Science and Technology, National University of Defense Technology, Changsha, China