🤖 AI Summary
To address the high communication overhead, complex decoding, or reliance on strong algebraic structure (as in PGR) that burden frequency estimation under Local Differential Privacy (LDP), this paper proposes the Modular Subset Selection (MSS) algorithm based on the Residue Number System (RNS). MSS integrates RNS encoding, randomized residue perturbation, and a subset-selection mechanism: each user transmits only a small number of modular residues, drastically reducing communication cost, while the server employs a lightweight LSMR iterative solver—avoiding computationally expensive dynamic-programming decoding—thus improving efficiency and resilience against reconstruction attacks. Experiments demonstrate that MSS achieves estimation accuracy comparable to Subset Selection (SS) and PGR, incurs lower communication overhead than SS, decodes significantly faster than PGR, and exhibits the lowest success rate under reconstruction attacks.
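To make the RNS encoding concrete, here is a minimal sketch of encoding a value as residues over pairwise-coprime moduli and reconstructing it via the Chinese Remainder Theorem. The moduli and value below are illustrative choices, not the paper's parameters, and this omits the LDP perturbation and subset-selection steps entirely.

```python
import math

def rns_encode(v, moduli):
    # Residue Number System: represent v by its residues
    # with respect to pairwise-coprime moduli.
    return [v % m for m in moduli]

def crt_decode(residues, moduli):
    # Chinese Remainder Theorem reconstruction; valid for
    # v < prod(moduli) when the moduli are pairwise coprime.
    M = math.prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # modular inverse of Mi mod m
    return x % M

moduli = [7, 11, 13]   # product 1001, so any v < 1001 is recoverable
v = 532
assert crt_decode(rns_encode(v, moduli), moduli) == v
```

In MSS each user reports only one (perturbed) residue rather than the full residue vector, so the server cannot run CRT per user; instead it aggregates reports and solves the resulting linear system (via LSMR) for the frequency vector.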
📝 Abstract
We present \textsf{ModularSubsetSelection} (MSS), a new algorithm for locally differentially private (LDP) frequency estimation. Given a universe of size $k$ and $n$ users, our $\varepsilon$-LDP mechanism encodes each input via a Residue Number System (RNS) over $\ell$ pairwise-coprime moduli $m_0, \ldots, m_{\ell-1}$, and reports a randomly chosen index $j \in [\ell]$ along with the perturbed residue using the statistically optimal \textsf{SubsetSelection} (SS) (Wang et al. 2016). This design reduces the user communication cost from the $\Theta\bigl(\omega \log_2(k/\omega)\bigr)$ bits required by standard SS (with $\omega \approx k/(e^\varepsilon+1)$) down to $\lceil \log_2 \ell \rceil + \lceil \log_2 m_j \rceil$ bits, where $m_j < k$. Server-side decoding runs in $\Theta(n + r k \ell)$ time, where $r$ is the number of LSMR (Fong and Saunders 2011) iterations. In practice, with well-conditioned moduli (\textit{i.e.}, constant $r$ and $\ell = \Theta(\log k)$), this becomes $\Theta(n + k \log k)$. We prove that MSS achieves worst-case MSE within a constant factor of state-of-the-art protocols such as SS and \textsf{ProjectiveGeometryResponse} (PGR) (Feldman et al. 2022), while avoiding the algebraic prerequisites and dynamic-programming decoder required by PGR. Empirically, MSS matches the estimation accuracy of SS, PGR, and \textsf{RAPPOR} (Erlingsson, Pihur, and Korolova 2014) across realistic $(k, \varepsilon)$ settings, while offering faster decoding than PGR and shorter user messages than SS. Lastly, by sampling from multiple moduli and reporting only a single perturbed residue, MSS achieves the lowest reconstruction-attack success rate among all evaluated LDP protocols.
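The communication-cost gap claimed in the abstract can be sanity-checked with back-of-the-envelope arithmetic. The sketch below plugs in an illustrative universe size, privacy budget, and moduli (all hypothetical, not the paper's experimental settings) to compare the $\Theta(\omega \log_2(k/\omega))$ bits of standard SS against the $\lceil \log_2 \ell \rceil + \lceil \log_2 m_j \rceil$ bits of an MSS report.

```python
import math

k = 10**6                                 # illustrative universe size
eps = 1.0                                 # privacy budget
omega = round(k / (math.exp(eps) + 1))    # SS subset size, omega ~ k/(e^eps + 1)

# Standard SS: transmit an omega-sized subset of [k],
# roughly omega * log2(k/omega) bits.
ss_bits = omega * math.log2(k / omega)

# MSS: transmit one modulus index plus one perturbed residue.
moduli = [101, 103, 107, 109]             # example pairwise-coprime moduli; product > k
ell = len(moduli)
mss_bits = math.ceil(math.log2(ell)) + math.ceil(math.log2(max(moduli)))

print(f"SS:  ~{ss_bits:,.0f} bits per user")
print(f"MSS: {mss_bits} bits per user")
```

With these numbers the SS message is on the order of hundreds of kilobits, while the MSS report fits in a single byte plus change, which is the asymptotic gap the abstract describes.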