Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

📅 2026-05-15
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work studies community recovery in the stochastic block model with a fixed number of communities under node differential privacy. Straightforward adaptations of existing private algorithms (directly privatizing the adjacency matrix, private PCA, private convex optimization, private low-rank matrix estimation, and private approximate subspace estimation) require the privacy parameter ε to grow rapidly in order to ensure consistent estimation, in contrast with the simpler edge-privacy setting. To alleviate this, the authors develop polynomial-time algorithms based on (1) sampling from an exponential mechanism with a Lipschitz extension and (2) a general framework for constructing smooth projections from undirected graphs to bounded-degree graphs, which can then be combined with various edge-private algorithms. The paper also derives lower bounds on the growth rate of ε required for consistent community estimation under node privacy, discusses the complications of analyzing private algorithms under the non-standard scaling ε → ∞, and gives a novel application of the Hirschfeld–Gebelein–Rényi (HGR) maximal correlation to accuracy amplification in PAC learning.
📝 Abstract
We study the classical problem of community recovery in stochastic block models with a fixed number of communities, with a twist: We seek algorithms that are stable with respect to node-wise changes in the graph structure, formally defined as a differential privacy constraint. The algorithms we develop are based on spectral clustering, where we introduce privacy to the community recovery pipeline in the form of directly privatizing the adjacency matrix; private PCA; private convex optimization; private low-rank matrix estimation; and private approximate subspace estimation. Straightforward applications of existing private algorithms lead to a rapid increase in the privacy parameter $\epsilon$ in order to ensure consistent estimation under node differential privacy, in contrast with the simpler setting of edge privacy. To alleviate these issues, we develop novel algorithms based on (1) sampling from an exponential mechanism with a Lipschitz extension and (2) a general framework for constructing smooth projections from the space of undirected graphs to the space of bounded-degree graphs, which can then be combined with various edge-private algorithms. Importantly, the methods we develop are all computable in polynomial time as a function of the number of nodes in the graph. We also develop novel lower bounds on the growth rate of $\epsilon$ required in order to achieve consistent community estimation under node privacy. On a technical note, our paper highlights the complications that arise when analyzing private algorithms under the non-standard scaling $\epsilon \rightarrow \infty$ and proposes some solutions. We also provide a novel application of the HGR maximal correlation from information theory in the context of accuracy amplification in PAC learning, which may be of independent interest.
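To make the "directly privatizing the adjacency matrix" step concrete, the following is a minimal sketch of the simpler edge-private baseline only, not the paper's node-private algorithms. It assumes two equal-sized communities, uses randomized response on each edge (a standard ε-edge-DP mechanism, swapped in here for illustration), and clusters by the sign of the leading eigenvector. All function names and parameter values (`sample_sbm`, `randomized_response`, `p`, `q`, `eps`) are hypothetical choices, not taken from the paper.

```python
import numpy as np

def sample_sbm(n, p, q, rng):
    """Sample a 2-community SBM with n nodes (n even), within-prob p, between-prob q."""
    labels = np.repeat([0, 1], n // 2)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p, q)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # strict upper triangle
    adj = (upper | upper.T).astype(float)             # symmetric, zero diagonal
    return adj, labels

def randomized_response(adj, eps, rng):
    """Flip each potential edge independently with prob 1/(1+e^eps), then debias.

    This gives eps-EDGE differential privacy (assumption: neighbors differ in
    one edge). It does not give node privacy.
    """
    n = adj.shape[0]
    flip_prob = 1.0 / (1.0 + np.exp(eps))
    flips = np.triu(rng.random((n, n)) < flip_prob, k=1)
    flips = flips | flips.T
    noisy = np.where(flips, 1.0 - adj, adj)
    # E[noisy] = flip_prob + (1 - 2*flip_prob) * adj, so invert that affine map.
    debiased = (noisy - flip_prob) / (1.0 - 2.0 * flip_prob)
    np.fill_diagonal(debiased, 0.0)
    return debiased

def spectral_two_communities(mat):
    """Cluster by the sign of the eigenvector with largest |eigenvalue| after centering."""
    centered = mat - mat.mean()
    vals, vecs = np.linalg.eigh(centered)
    lead = vecs[:, np.argmax(np.abs(vals))]
    return (lead > 0).astype(int)

def accuracy(est, truth):
    """Fraction of correctly labeled nodes, up to swapping the two labels."""
    agree = np.mean(est == truth)
    return max(agree, 1.0 - agree)

rng = np.random.default_rng(0)
adj, labels = sample_sbm(200, 0.6, 0.1, rng)
private_adj = randomized_response(adj, eps=2.0, rng=rng)
print("accuracy:", accuracy(spectral_two_communities(private_adj), labels))
```

Under node privacy, neighboring graphs may differ in every edge incident to one node, so per-edge noise of this kind no longer suffices at the same ε; this is the gap that the abstract's Lipschitz-extension and bounded-degree-projection approaches are meant to address.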
Problem

Research questions and friction points this paper is trying to address.

node-private community estimation
stochastic block models
differential privacy
community recovery
consistent estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

node differential privacy
stochastic block model
private spectral clustering
Lipschitz extension
polynomial-time algorithms