🤖 AI Summary
This work addresses relational learning on attributed graphs under entity-level differential privacy (DP), tackling two key challenges: (i) high and ill-defined global sensitivity arising from entities participating in multiple relations, and (ii) the breakdown of classical privacy amplification analysis due to multi-stage coupled sampling. To resolve these, we establish the first formal theoretical framework for entity-level DP in relational settings. We propose a frequency-aware adaptive gradient clipping mechanism that tightly bounds sensitivity based on entity occurrence counts. Furthermore, we generalize the privacy amplification theorem to a tractable subclass of coupled sampling in which sample sizes are coupled across stages. Empirically, fine-tuning encoder models on text-attributed networks demonstrates substantial improvements in the privacy–utility trade-off. Experiments with our open-source implementation validate the method's effectiveness, rigorous DP guarantees, and scalability on real-world relational datasets.
📝 Abstract
Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend the privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate the strong utility-privacy trade-offs of our approach. Our code is available at https://github.com/Graph-COM/Node_DP.
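To make the adaptive clipping idea concrete, here is a minimal sketch of a DP-SGD-style step in which each per-example gradient is clipped to a threshold that shrinks with the owning entity's occurrence count. The specific scaling rule `C_i = C / f_i` is one illustrative choice (it caps any single entity's total contribution at `C`); the paper's actual mechanism, frequency estimation, and noise calibration may differ.

```python
import numpy as np

def adaptive_clip(per_example_grads, entity_freqs, base_clip=1.0):
    """Clip each per-example gradient to C_i = base_clip / f_i, where
    f_i is the occurrence count of the entity owning that example.
    (Hypothetical scaling rule for illustration only.)"""
    clipped = []
    for g, f in zip(per_example_grads, entity_freqs):
        c = base_clip / f  # tighter threshold for frequent entities
        norm = np.linalg.norm(g)
        # Standard norm clipping: rescale only if the norm exceeds c.
        clipped.append(g * min(1.0, c / norm) if norm > 0 else g)
    return clipped

def dp_sgd_step(per_example_grads, entity_freqs, base_clip=1.0,
                noise_multiplier=1.0, rng=None):
    """One private step: clip adaptively, sum, add Gaussian noise.
    With C_i = base_clip / f_i, an entity appearing f_i times
    contributes at most f_i * C_i = base_clip to the summed gradient,
    so entity-level sensitivity is bounded by base_clip."""
    rng = rng or np.random.default_rng(0)
    clipped = adaptive_clip(per_example_grads, entity_freqs, base_clip)
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * base_clip,
                       size=total.shape)
    return (total + noise) / len(per_example_grads)
```

For example, a gradient of norm 5.0 from an entity appearing twice is clipped to norm 0.5 under `base_clip=1.0`, while a gradient of norm 0.5 from a singleton entity passes through unchanged. The formal privacy accounting (amplification under coupled sampling) is analyzed in the paper and not reproduced here.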