🤖 AI Summary
This work addresses a previously unexplored privacy threat in Federated Graph Learning (FGL): servers inferring clients' label distributions from uploaded GNN models. We propose EC-LDA, the first label distribution attack (LDA) based on GNN embedding compression. Methodologically, we uncover an intrinsic link between node embedding variance and label distribution inferability; integrate embedding compression, statistical similarity modeling, and distribution alignment optimization; and jointly evaluate inference quality via cosine similarity and Jensen–Shannon divergence. Evaluated on six benchmark datasets, including CoraFull and LastFM, our attack significantly outperforms state-of-the-art methods under both metrics and exhibits robustness against differential privacy defenses. Our core contributions are threefold: (i) introducing embedding compression into the LDA framework for FGL; (ii) establishing a theoretical variance–accuracy relationship governing inference precision; and (iii) providing a novel analytical perspective and a strong baseline for assessing label-distribution leakage in FGL.
📝 Abstract
Graph Neural Networks (GNNs) have been widely used for graph analysis. Federated Graph Learning (FGL) is an emerging framework for collaboratively training GNNs on graph data held by multiple clients. However, because clients must upload model parameters to the server in each round, the server has an opportunity to infer private properties of each client's data. In this paper, we focus on label distribution attacks (LDAs), which aim to infer the label distributions of clients' local data, and we take the first step toward attacking clients' label distributions in FGL. First, we observe that the effectiveness of an LDA is closely related to the variance of node embeddings in GNNs. Second, we analyze this relationship and propose a new attack, EC-LDA, which significantly improves attack effectiveness by compressing node embeddings. Third, extensive experiments on node classification and link prediction tasks across six widely used graph datasets show that EC-LDA outperforms state-of-the-art LDAs; for example, it attains optimal values under both the Cos-sim and JS-div metrics on the CoraFull and LastFM datasets. Finally, we explore the robustness of EC-LDA under differential privacy protection.
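The two evaluation metrics above compare an inferred label distribution against the client's true one. As a minimal sketch (not the paper's implementation; the example distributions are hypothetical), Cos-sim and JS-div between two label-distribution vectors can be computed as follows, where a strong attack yields Cos-sim near 1 and JS-div near 0:

```python
import numpy as np

def cos_sim(p, q):
    # Cosine similarity between two label-distribution vectors.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

def js_div(p, q, eps=1e-12):
    # Jensen-Shannon divergence with base-2 logs, bounded in [0, 1].
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution
    kl = lambda a, b: float(np.sum(a * np.log2((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical true vs. inferred label distributions over 4 classes.
true_dist = [0.40, 0.30, 0.20, 0.10]
inferred  = [0.38, 0.32, 0.19, 0.11]
print(cos_sim(true_dist, inferred))  # close to 1 for an accurate attack
print(js_div(true_dist, inferred))   # close to 0 for an accurate attack
```

Reporting both metrics is complementary: Cos-sim captures overall directional agreement of the distribution vectors, while JS-div penalizes probability-mass mismatches symmetrically.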