π€ AI Summary
Sparse variable selection under differential privacy (DP) remains challenging in high-dimensional learning due to computational intractability of combinatorial search and statistical efficiency trade-offs. Method: We propose two pure DP estimators: (i) a structured sampling strategy based on the exponential mechanism, circumventing exhaustive combinatorial search; and (ii) the first integration of modern mixed-integer programming (MIP) techniques into DP variable selection, coupled with least-squares and hinge loss optimization for regression and classification, respectively. Contribution/Results: We establish rigorous statistical recovery guarantees under DP. Experiments on datasets with up to 10,000 dimensions demonstrate that our methods significantly outperform existing DP approaches, achieving state-of-the-art variable selection recovery accuracy in both regression and classification tasksβwhile simultaneously ensuring strong privacy protection, high estimation precision, and model interpretability.
π Abstract
Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression - known as Best Subset Selection (BSS) - with millions of variables in minutes. However, extending these algorithmic advances to the setting of Differential Privacy (DP) has remained largely unexplored. In this paper, we introduce two new pure differentially private estimators for sparse variable selection, levering modern MIP techniques. Our framework is general and applies broadly to problems like sparse regression or classification, and we provide theoretical support recovery guarantees in the case of BSS. Inspired by the exponential mechanism, we develop structured sampling procedures that efficiently explore the non-convex objective landscape, avoiding the exhaustive combinatorial search in the exponential mechanism. We complement our theoretical findings with extensive numerical experiments, using both least squares and hinge loss for our objective function, and demonstrate that our methods achieve state-of-the-art empirical support recovery, outperforming competing algorithms in settings with up to $p=10^4$.