🤖 AI Summary
It remains an open question whether probabilistic counters—widely used in privacy-preserving data aggregation—require additional noise injection to satisfy differential privacy.
Method: Grounded in the rigorous definition of ε-differential privacy, this work formally proves for the first time that the intrinsic randomness inherent in probabilistic counter structures (e.g., LogLog, HyperLogLog)—arising from their fundamental approximation mechanisms—is sufficient to guarantee ε-differential privacy without external noise. We design a reusable privacy mechanism leveraging only protocol-internal randomness, eliminating accuracy degradation caused by conventional Laplace or Gaussian noise. Our approach integrates differential privacy theory, probabilistic algorithm analysis, and distributed protocol engineering.
Contribution/Results: We establish the intrinsic privacy guarantee of probabilistic counters and propose a provably secure, high-accuracy, low-overhead aggregation protocol suitable for distributed surveys and similar applications—achieving practical privacy protection with zero extraneous randomization.
📝 Abstract
Probabilistic counters are well-known tools often used for space-efficient set cardinality estimation. In this paper we investigate probabilistic counters from the perspective of preserving privacy. We use the standard, rigorous notion of differential privacy. The intuition is that probabilistic counters do not reveal much information about individuals, but provide only general information about the population; thus they can be used safely without violating individuals' privacy. It turns out, however, that providing a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach.
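To make the setting concrete, here is a minimal LogLog-style counter sketch. This is an illustrative reconstruction, not code from the paper: the bucket count and the correction constant `alpha` are assumptions chosen for a large number of buckets. The point is that the only randomness in the structure is the hash of each item — precisely the intrinsic randomness whose privacy properties are being analyzed.

```python
import hashlib

class ProbabilisticCounter:
    """Minimal LogLog-style cardinality sketch (illustrative, hypothetical parameters)."""

    def __init__(self, num_buckets=64):
        self.num_buckets = num_buckets
        self.registers = [0] * num_buckets  # each stores a max bit-rank

    def add(self, item: str) -> None:
        h = int(hashlib.sha256(item.encode()).hexdigest(), 16)
        bucket = h % self.num_buckets          # low bits pick a bucket
        rest = h // self.num_buckets           # remaining bits drive the estimate
        # rank of the lowest set bit: P(rank = k) = 2^-k, a geometric variable.
        # This hash-induced randomness is the counter's only source of noise.
        rank = (rest & -rest).bit_length()
        self.registers[bucket] = max(self.registers[bucket], rank)

    def estimate(self) -> float:
        # LogLog estimator: alpha * m * 2^(mean register value);
        # alpha ~ 0.39701 is the asymptotic correction constant.
        avg = sum(self.registers) / self.num_buckets
        return 0.39701 * self.num_buckets * 2 ** avg
```

Note that an individual item influences at most one register, and only through a geometrically distributed hash rank — this is the kind of structural property the paper's differential-privacy analysis exploits.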
We also demonstrate that probabilistic counters can be used as a privacy protection mechanism without any extra randomization. That is, the randomization inherent in the protocol is sufficient to protect privacy, even if the probabilistic counter is used many times. In particular, we present a specific privacy-preserving data aggregation protocol based on a probabilistic counter. Our results can be used, for example, for performing distributed surveys.
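The idea that a protocol's inherent randomization can itself protect privacy is easiest to see with the simplest probabilistic counter, a Morris counter. The sketch below is a hypothetical illustration of counting "yes" answers in a survey, not the paper's actual protocol: each increment succeeds only with probability 2^(-c), so the coin flip, which the counter needs anyway for space efficiency, is the sole source of noise seen by anyone observing the final register.

```python
import random

class MorrisCounter:
    """Morris approximate counter (illustrative sketch).

    Stores only the register c; after n increments, 2^c - 1 is an
    unbiased estimate of n. No external Laplace/Gaussian noise is added:
    the increment coin flip is the only randomness.
    """

    def __init__(self, rng=None):
        self.c = 0
        self.rng = rng or random.Random()

    def increment(self) -> None:
        # A single respondent's "yes" changes the register only with
        # probability 2^-c, which blurs any individual's contribution.
        if self.rng.random() < 2 ** -self.c:
            self.c += 1

    def estimate(self) -> float:
        return 2 ** self.c - 1
```

In a distributed-survey reading of this sketch, each "yes" triggers one probabilistic increment, and the aggregator publishes only the estimate; the paper's contribution is a formal proof that (for the counters it studies) this internal randomness alone meets the differential-privacy definition.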