🤖 AI Summary
This work identifies a novel privacy threat in federated unlearning (FU): adversaries can infer the labels of forgotten user data by analyzing model parameter changes. To demonstrate this threat, we propose the first systematic label inference attack, ULIA, which establishes a gradient–label mapping mechanism. ULIA models parameter differences and gradient dynamics to achieve high-accuracy label recovery across diverse unlearning levels (1%–100%). It remains effective under both IID and non-IID data distributions, achieving a 100% attack success rate in IID settings; even when only 1% of a client's local data is unlearned, success rates range from 62.3% to 93%. Our findings fundamentally challenge the prevailing privacy and security assumptions underlying existing FU mechanisms. By exposing this vulnerability, we provide critical theoretical insights and an empirical benchmark for trustworthy evaluation and robust defense design in federated unlearning.
📄 Abstract
Federated Unlearning (FU) has emerged as a promising solution for honoring clients' right to be forgotten, allowing clients to erase their data from global models without compromising model performance. Unfortunately, the model parameter variations induced by FU expose information about clients' data, enabling attackers to infer the labels of unlearned data; yet label inference attacks against FU remain unexplored. In this paper, we introduce and analyze this new privacy threat against FU and propose a novel label inference attack, ULIA, which can infer unlearned data labels across three FU levels. To address the unique challenge of inferring labels from model variations, we design a gradient–label mapping mechanism in ULIA that establishes a relationship between gradient variations and unlearned labels, enabling label inference from accumulated model variations. We evaluate ULIA in both IID and non-IID settings. Experimental results show that in the IID setting, ULIA achieves a 100% Attack Success Rate (ASR) under both class-level and client-level unlearning. Even when only 1% of a user's local data is forgotten, ULIA still attains an ASR between 62.3% and 93%.
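The paper's exact mechanism is not reproduced here, but the kind of gradient–label relationship such an attack can exploit is well known: with cross-entropy loss, the gradient of a final linear layer's bias equals softmax(logits) minus the one-hot label, so its single negative entry reveals the ground-truth class. The PyTorch sketch below illustrates this relationship for a single example under that assumption; all names (`head`, `features`, etc.) are illustrative, and ULIA's actual mechanism, which operates on accumulated model variations across FU levels, is not shown.

```python
# Illustrative sketch (not ULIA itself): recovering a label from
# last-layer gradients. For cross-entropy loss, the bias gradient of the
# final linear layer is softmax(logits) - one_hot(y), so its unique
# negative entry identifies the ground-truth label y.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes, feat_dim = 10, 32
true_label = torch.tensor([7])

# Toy client model: a single linear classification head (hypothetical).
head = torch.nn.Linear(feat_dim, num_classes)
features = torch.randn(1, feat_dim)

loss = F.cross_entropy(head(features), true_label)
loss.backward()

# The only negative bias-gradient entry reveals the label.
inferred_label = torch.argmin(head.bias.grad).item()
print(inferred_label)  # 7, matching true_label
```

A gradient-based label inference attack on FU cannot observe such per-step gradients directly; it only sees parameter differences between the global models before and after unlearning, which is why ULIA's gradient–label mapping must relate accumulated model variations back to gradient behavior.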