Locally Private Estimation with Public Features

📅 2024-05-22
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper studies local differential privacy (LDP) when some features are public: a subset of features is publicly available, while the remaining features and the label must satisfy LDP constraints. The authors formalize this setting as semi-feature LDP, in contrast to conventional LDP, where all features are perturbed. For nonparametric regression, they show that the minimax convergence rate under semi-feature LDP is significantly reduced compared with classical LDP. They propose HistOfTree, an estimator that combines histogram-based partitioning with tree-structured feature splitting to use the information in both public and private features, and prove that it attains the minimax-optimal rate. Experiments on synthetic and real-world datasets show superior performance. The paper also considers the case where users manually choose which features to protect, and gives an estimator with a data-driven parameter tuning strategy that yields analogous theoretical and empirical results.

📝 Abstract
We initiate the study of locally differentially private (LDP) learning with public features. We define semi-feature LDP, where some features are publicly available while the remaining ones, along with the label, require protection under local differential privacy. Under semi-feature LDP, we demonstrate that the minimax convergence rate for nonparametric regression is significantly reduced compared to that of classical LDP. We then propose HistOfTree, an estimator that fully leverages the information contained in both public and private features. Theoretically, HistOfTree reaches the minimax-optimal convergence rate. Empirically, HistOfTree achieves superior performance on both synthetic and real data. We also explore scenarios where users have the flexibility to manually select which features to protect. In such cases, we propose an estimator and a data-driven parameter tuning strategy, leading to analogous theoretical and empirical results.
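To make the semi-feature setting concrete, the following is a minimal sketch of a generic privatized histogram regression in which one feature is public and the other is private. It is not the paper's HistOfTree estimator. The function names `privatize_record` and `fit_cells`, the equal-width binning, and the assumption that both features and the label lie in [0, 1] are all hypothetical choices made for illustration. Each user perturbs only the information that depends on the private feature and the label, using the standard Laplace mechanism, while the public feature is sent in the clear.

```python
import numpy as np

def privatize_record(x_priv, y, k, eps, rng):
    """User side (hypothetical helper): release a noisy vector
    [y * onehot(bin), onehot(bin)] built from the private feature and label.
    Assumes x_priv and y both lie in [0, 1]."""
    b = min(int(x_priv * k), k - 1)   # equal-width bin of the private feature
    v = np.zeros(2 * k)
    v[b] = y                          # label placed in the private bin
    v[k + b] = 1.0                    # bin membership indicator
    # Two users' vectors differ by at most 4 in L1 norm, so Laplace noise
    # with scale 4 / eps on every coordinate gives eps-LDP for (x_priv, y).
    return v + rng.laplace(scale=4.0 / eps, size=2 * k)

def fit_cells(x_pub, reports, m, k):
    """Server side (hypothetical helper): average the noisy reports inside
    each public bin, then form a ratio estimate per (public, private) cell."""
    pub_bin = np.minimum((x_pub * m).astype(int), m - 1)  # public, no noise
    sums = np.zeros((m, 2 * k))
    np.add.at(sums, pub_bin, reports)  # accumulate reports per public bin
    num, den = sums[:, :k], sums[:, k:]
    # Guard against tiny or negative noisy counts, then clip to label range.
    return np.clip(num / np.maximum(den, 1.0), 0.0, 1.0)
```

A caller would stack one `privatize_record` output per user into an `(n, 2k)` array and pass it to `fit_cells` together with the raw public feature values. The design point this sketch illustrates is that noise is added only along the private coordinate, so the partition along the public feature costs no privacy budget.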
Problem

Research questions and friction points this paper is trying to address.

Study locally differentially private learning when some features are public
Propose HistOfTree for minimax-optimal convergence under semi-feature LDP
Explore settings where users manually select which features to protect
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-feature local differential privacy
HistOfTree estimator development
Data-driven parameter tuning strategy
Yuheng Ma
School of Statistics, Renmin University of China
Ke Jia
School of Statistics, Renmin University of China
Hanfang Yang
Assistant Professor of Statistics, School of Statistics, Renmin University of China