Private Estimation and Inference in High-Dimensional Regression with FDR Control

📅 2023-10-25

📈 Citations: 3

✨ Influential: 1

career value

244K/year

🤖 AI Summary

This paper addresses the challenges of differentially private (DP) parameter estimation, statistical inference, and multiple testing control in high-dimensional linear regression. To this end, we propose a unified framework comprising three key components: (i) the first DP-protected Bayesian Information Criterion (BIC) for adaptive model sparsity selection; (ii) a privacy-preserving debiased LASSO estimator enabling unbiased parameter estimation and valid confidence interval construction under DP; and (iii) the first provably false discovery rate (FDR)-controlled DP multiple testing procedure, built upon a privacy-adapted Benjamini–Hochberg algorithm. Theoretical analysis establishes rigorous DP guarantees, statistical efficiency, and exact FDR control at the nominal level. Empirical evaluation demonstrates substantial improvements over state-of-the-art DP baselines: 42% higher sparsity identification accuracy, confidence interval coverage approaching the nominal level, and stable FDR containment strictly below the pre-specified threshold.

📝 Abstract

This paper presents novel methodologies for conducting practical differentially private (DP) estimation and inference in high-dimensional linear regression. We start by proposing a differentially private Bayesian Information Criterion (BIC) for selecting the unknown sparsity parameter in DP-Lasso, eliminating the need for prior knowledge of model sparsity, a requisite in the existing literature. Then we propose a differentially private debiased LASSO algorithm that enables privacy-preserving inference on regression parameters. Our proposed method enables accurate and private inference on the regression parameters by leveraging the inherent sparsity of high-dimensional linear regression models. Additionally, we address the issue of multiple testing in high-dimensional linear regression by introducing a differentially private multiple testing procedure that controls the false discovery rate (FDR). This allows for accurate and privacy-preserving identification of significant predictors in the regression model. Through extensive simulations and real data analysis, we demonstrate the efficacy of our proposed methods in conducting inference for high-dimensional linear models while safeguarding privacy and controlling the FDR.

Problem

Research questions and friction points this paper is trying to address.

Develops private high-dimensional regression without prior sparsity knowledge

Enables privacy-preserving inference on specific regression parameters

Controls false discovery rate in private feature selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

DP-BIC selects sparsity parameter without prior knowledge

DP debiased algorithm enables privacy-preserving parameter inference

DP multiple testing procedure controls false discovery rate

🔎 Similar Papers

No similar papers found.