GRIP2: A Robust and Powerful Deep Knockoff Method for Feature Selection

📅 2026-01-30
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of achieving both effective feature selection and rigorous false discovery rate (FDR) control in high-correlation, low-signal-to-noise settings, a regime where existing methods often fail. The authors propose GRIP2, a novel approach that introduces an integral framework over a two-dimensional regularization path. By integrating the activity of first-layer features from a deep network across this surface, GRIP2 constructs a naturally antisymmetric feature importance statistic. Coupled with block random sampling, this enables efficient approximation from a single model training run. The method provides finite-sample FDR guarantees and substantially outperforms current deep feature selection techniques. When applied to real-world HIV drug resistance data, GRIP2 successfully recovers known resistance-associated mutations and surpasses classical linear baselines in performance.

πŸ“ Abstract
Identifying truly predictive covariates while strictly controlling false discoveries remains a fundamental challenge in nonlinear, highly correlated, and low signal-to-noise regimes, where deep learning based feature selection methods are most attractive. We propose Group Regularization Importance Persistence in 2 Dimensions (GRIP2), a deep knockoff feature importance statistic that integrates first-layer feature activity over a two-dimensional regularization surface controlling both sparsity strength and sparsification geometry. To approximate this surface integral in a single training run, we introduce efficient block-stochastic sampling, which aggregates feature activity magnitudes across diverse regularization regimes along the optimization trajectory. The resulting statistics are antisymmetric by construction, ensuring finite-sample FDR control. In extensive experiments on synthetic and semi-real data, GRIP2 demonstrates improved robustness to feature correlation and noise level: in high correlation and low signal-to-noise ratio regimes where standard deep learning based feature selectors may struggle, our method retains high power and stability. Finally, on real-world HIV drug resistance data, GRIP2 recovers known resistance-associated mutations with power better than established linear baselines, confirming its reliability in practice.
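The abstract states that antisymmetric statistics give finite-sample FDR control but does not describe the selection step itself. As a rough illustration only, and assuming the GRIP2 statistics are passed to the standard knockoff+ filter (an assumption, not something this page confirms), the selection could look like the sketch below. The function names `knockoff_plus_threshold` and `knockoff_select` and the toy statistics are hypothetical.

```python
import math


def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for statistics w at target FDR q.

    Assumes w[j] > 0 suggests feature j beats its knockoff and that the
    signs of null statistics are symmetric (the antisymmetry property).
    """
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # Conservative estimate of the false discovery proportion at t.
        if (1 + negatives) / max(1, positives) <= q:
            return t
    # No threshold meets the target: select nothing.
    return math.inf


def knockoff_select(w, q):
    """Return indices of features whose statistic reaches the threshold."""
    tau = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= tau]


# Toy statistics (made up for illustration), target FDR of 20%.
toy_w = [5, 4, 3, 2.5, -1, 0.5, -0.2, 6, 7, 3.5]
selected = knockoff_select(toy_w, q=0.2)
```

The "+1" in the numerator is what distinguishes knockoff+ from the plain knockoff filter; it makes the estimate conservative enough for exact finite-sample control, at the cost of selecting nothing when too few statistics are large and positive.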
Problem

Research questions and friction points this paper is trying to address.

feature selection
false discovery rate
highly correlated features
low signal-to-noise ratio
nonlinear models
Innovation

Methods, ideas, or system contributions that make the work stand out.

deep knockoff
feature selection
FDR control
regularization geometry
block-stochastic sampling
Bob Junyi Zou
Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California, USA
Lu Tian
Stanford University
Biostatistics