🤖 AI Summary
This paper addresses spectral graph clustering under edge differential privacy (edge-DP), balancing privacy preservation, clustering accuracy, and computational efficiency. To this end, the authors propose three synergistic mechanisms: (1) graph structure perturbation via randomized edge flipping and adjacency matrix shuffling; (2) private graph projection into a low-dimensional space with calibrated Gaussian noise; and (3) a private power iteration algorithm incorporating iterative distributed noise injection. They establish theoretically that the proposed framework satisfies $(\varepsilon,\delta)$-edge-DP and derive an error bound that vanishes asymptotically as the node count grows. Extensive experiments on synthetic and real-world networks demonstrate that the method significantly outperforms existing baselines, achieving high community detection accuracy even under stringent privacy budgets. The framework thus provides a scalable, analytically grounded paradigm for privacy-preserving clustering of large-scale graph data.
📝 Abstract
We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms: (i) graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph. Importantly, shuffling considerably amplifies the guarantees: whereas flipping edges with a fixed probability alone provides only a constant-$\varepsilon$ edge-DP guarantee as the number of nodes grows, the shuffled mechanism achieves $(\varepsilon,\delta)$ edge-DP with parameters that tend to zero as the number of nodes increases; (ii) private graph projection with additive Gaussian noise, which reduces both dimensionality and computational complexity; and (iii) a noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.
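The first mechanism, randomized edge flipping, can be viewed as randomized response applied independently to each potential edge of the graph. The snippet below is a minimal sketch under that reading only: the function name `flip_edges`, the flip probability `p`, and the seeding are illustrative choices of ours, and the paper's adjacency matrix shuffling step (which provides the privacy amplification) is not included.

```python
import numpy as np

def flip_edges(adj, p, seed=0):
    """Randomized response on edges (illustrative sketch, not the
    paper's full mechanism): flip each entry of the strict upper
    triangle of a symmetric 0/1 adjacency matrix independently with
    probability p, then symmetrize. For a single bit, flipping with
    probability p < 1/2 is the classic ln((1-p)/p)-DP randomized
    response; shuffling (omitted here) is what amplifies this further.
    """
    n = adj.shape[0]
    rng = np.random.default_rng(seed)
    # Boolean flip decisions, restricted to the strict upper triangle
    # so each undirected edge is perturbed exactly once.
    mask = np.triu(rng.random((n, n)) < p, k=1)
    flipped = np.where(mask, 1 - adj, adj)
    # Rebuild a symmetric matrix with a zero diagonal (no self-loops).
    upper = np.triu(flipped, k=1)
    return upper + upper.T
```

With `p = 0` the graph is returned unchanged; as `p` approaches `1/2` each edge indicator becomes nearly uniform, trading utility for privacy.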