🤖 AI Summary
This work addresses the design of disclosure mechanisms that control worst-case privacy leakage when the agent observes only useful data correlated with the sensitive information, not the sensitive information itself. The authors propose a sparse point-wise privacy leakage criterion and use information geometry to obtain a local quadratic approximation of mutual information in the high-privacy regime. When the leakage matrix is invertible, the mechanism design reduces to a sparse Rayleigh-quotient maximization with an ℓ₀ constraint, and for this approximated problem a binary, uniformly distributed disclosed variable suffices without loss of optimality. For small alphabet sizes, the sparse optimum can be computed exactly by combinatorial support enumeration, which becomes intractable as the dimension grows. For general dimensions, where the problem is NP-hard and closely related to sparse PCA, they propose a semidefinite programming (SDP) relaxation that is solvable in polynomial time, together with a simple rounding procedure that recovers a feasible leakage direction. They also identify a sparsity threshold beyond which the sparse optimum saturates at the unconstrained spectral value and the SDP relaxation becomes tight.
📝 Abstract
We study an information-theoretic privacy mechanism design problem, where an agent observes useful data $Y$ that is arbitrarily correlated with sensitive data $X$, and designs disclosed data $U$ generated from $Y$ (the agent has no direct access to $X$). We introduce \emph{sparse point-wise privacy leakage}, a worst-case privacy criterion that enforces two simultaneous constraints for every disclosed symbol $u\in\mathcal{U}$: (i) $u$ may be correlated with at most $N$ realizations of $X$, and (ii) the total leakage toward those realizations is bounded. In the high-privacy regime, we use concepts from information geometry to obtain a local quadratic approximation of the mutual information between $U$ and $Y$, which measures utility. When the leakage matrix $P_{X|Y}$ is invertible, this approximation reduces the design problem to a sparse quadratic maximization, known as the Rayleigh-quotient problem, with an $\ell_0$ constraint. We further show that, for the approximated problem, one can without loss of optimality restrict attention to a binary released variable $U$ with a uniform distribution. For small alphabet sizes, the exact sparsity-constrained optimum can be computed via combinatorial support enumeration, although this enumeration quickly becomes intractable as the dimension grows. For general dimensions, the resulting sparse Rayleigh-quotient maximization is NP-hard and closely related to sparse principal component analysis (PCA). We propose a convex semidefinite programming (SDP) relaxation that is solvable in polynomial time and provides a tractable surrogate for the NP-hard design, together with a simple rounding procedure to recover a feasible leakage direction. We also identify a sparsity threshold beyond which the sparse optimum saturates at the unconstrained spectral value and the SDP relaxation becomes tight.
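The support-enumeration step and the saturation behaviour can be illustrated on a generic sparse Rayleigh-quotient problem, $\max_x x^\top A x$ subject to $\|x\|_2 = 1$ and $\|x\|_0 \le k$. The following is a minimal sketch under stated assumptions, not the paper's mechanism: the matrix `A`, the function name `sparse_rayleigh_enum`, and the planted sparse eigenvector are all hypothetical, and any further constraints the actual design problem may impose are ignored.

```python
from itertools import combinations

import numpy as np


def sparse_rayleigh_enum(A, k):
    """Maximize x^T A x over unit vectors with at most k nonzeros.

    Enumerates every support of size k and takes the top eigenpair of the
    corresponding principal submatrix. For symmetric A, supports of size
    exactly k suffice, since enlarging a support never lowers the top
    eigenvalue (Cauchy interlacing).
    """
    d = A.shape[0]
    best_val, best_x = -np.inf, None
    for support in combinations(range(d), k):
        idx = np.array(support)
        sub = A[np.ix_(idx, idx)]
        vals, vecs = np.linalg.eigh(sub)  # eigenvalues in ascending order
        if vals[-1] > best_val:
            best_val = vals[-1]
            best_x = np.zeros(d)
            best_x[idx] = vecs[:, -1]
    return best_val, best_x


# Hypothetical toy matrix: a planted top eigenvector with 3 nonzeros.
d = 6
v = np.zeros(d)
v[:3] = 1.0 / np.sqrt(3.0)
A = 5.0 * np.outer(v, v) + 0.1 * np.eye(d)

spectral = np.linalg.eigvalsh(A)[-1]  # unconstrained optimum
values = [sparse_rayleigh_enum(A, k)[0] for k in range(1, d + 1)]
for k, val in enumerate(values, start=1):
    print(f"k={k}: sparse optimum={val:.4f}, spectral value={spectral:.4f}")
```

In this toy instance the sparse optimum stops increasing once $k$ reaches the number of nonzeros in the top eigenvector, at which point it equals the unconstrained spectral value. The loop visits $\binom{d}{k}$ supports, which is why exact enumeration is practical only for small alphabets.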