🤖 AI Summary
Multi-party collusion poses a novel backdoor threat in vertical federated learning (VFL), where labels reside exclusively on the server while features are distributed across clients. Existing attacks rely on server-side gradient access, limiting practicality and stealth.
Method: We propose the first server-gradient-free, collaborative, decentralized backdoor attack framework for VFL. It integrates: (1) a consensus-based data selection mechanism over the adversarial graph topology; (2) cross-client trigger partitioning with intensity-weighted embedding; and (3) a metric-learning-enhanced variational autoencoder for local label inference. Theoretical analysis characterizes convergence via stationarity-gap bounds.
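To make component (1) concrete, here is a minimal sketch of consensus-based poisoned-sample selection over an adversary graph. All function names, the gossip-averaging scheme, and the scoring inputs are illustrative assumptions, not the paper's actual protocol: each colluding client scores candidate samples locally, repeatedly averages its scores with graph neighbours, and the agreed ranking determines the poison set.

```python
import numpy as np

def consensus_scores(local_scores, adjacency, rounds=50):
    # Each adversary repeatedly averages its score vector with its
    # graph neighbours (including itself). On a connected graph this
    # drives all rows to a common agreed score per sample.
    scores = np.array(local_scores, dtype=float)   # (n_adversaries, n_samples)
    A = np.array(adjacency, dtype=float) + np.eye(len(local_scores))
    W = A / A.sum(axis=1, keepdims=True)           # row-stochastic mixing matrix
    for _ in range(rounds):
        scores = W @ scores
    return scores

def select_poison_targets(local_scores, adjacency, k):
    # After consensus all rows are (numerically) identical, so any
    # adversary can read off the agreed top-k samples to poison.
    agreed = consensus_scores(local_scores, adjacency)[0]
    return np.argsort(agreed)[-k:][::-1]
```

Note the mixing matrix here is only row-stochastic, so the agreed value is a weighted (not uniform) average of the local scores; a doubly-stochastic scheme such as Metropolis weights would recover the exact mean.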
Results: Experiments show that our method achieves significantly higher attack success rates than state-of-the-art approaches without accessing server gradients, while incurring notably smaller main-task accuracy degradation. This work provides the first empirical evidence that multi-party collusion enhances backdoor effectiveness in VFL.
📝 Abstract
Federated learning (FL) is vulnerable to backdoor attacks, where adversaries alter model behavior on target classification labels by embedding triggers into data samples. While these attacks have received considerable attention in horizontal FL, they are less understood for vertical FL (VFL), where devices hold different features of the samples and only the server holds the labels. In this work, we propose a novel backdoor attack on VFL which (i) does not rely on gradient information from the server and (ii) considers potential collusion among multiple adversaries for sample selection and trigger embedding. Our label inference model augments variational autoencoders with metric learning, which adversaries can train locally. A consensus process over the adversary graph topology determines which datapoints to poison. We further propose methods for splitting the trigger across the adversaries, with an intensity-based implantation scheme that skews the server model towards the trigger. Our convergence analysis reveals the impact of backdoor perturbations on VFL training, indicated by a stationarity gap for the trained model, which we also verify empirically. We conduct experiments comparing our attack with recent backdoor VFL approaches, finding that ours achieves significantly higher success rates at the same main-task performance despite not using server information. Our results additionally verify the impact of collusion on attack performance.
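The label-inference objective pairs a VAE with metric learning. A minimal numpy sketch of what such a combined loss could look like is below; the mean-squared reconstruction term, standard Gaussian KL, triplet margin formulation, and the weighting `lam` are all assumptions for illustration, and the paper's actual architecture and loss may differ.

```python
import numpy as np

def vae_elbo_loss(x, x_recon, mu, logvar):
    # Reconstruction error plus KL divergence between the approximate
    # posterior N(mu, exp(logvar)) and the standard normal prior N(0, I).
    recon = np.mean((x - x_recon) ** 2)
    kl = -0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + kl

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Metric-learning term on latent codes: pull same-class embeddings
    # together, push different-class embeddings apart by at least `margin`.
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def combined_loss(x, x_recon, mu, logvar, anchor, pos, neg, lam=0.5):
    # Hypothetical weighting of the two objectives.
    return vae_elbo_loss(x, x_recon, mu, logvar) + lam * triplet_loss(anchor, pos, neg)
```

The intuition is that the metric-learning term shapes the VAE latent space into class-separated clusters, which is what lets an adversary infer labels locally without any server gradients.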