🤖 AI Summary
Modeling high-order synergistic effects in high-dimensional data is difficult because a method must simultaneously select features, capture complex nonlinear relationships, and support valid statistical inference. This work proposes Neural Network Machine Regression (NNMR), a framework that integrates trainable input gating and adaptive depth regularization to perform sparse feature selection and nonparametric function estimation jointly, in a single end-to-end training procedure. For post-selection inference without parametric assumptions, the method combines sample splitting with permutation testing, which controls the type I error rate. In simulation studies, NNMR outperforms competing approaches such as Bayesian kernel machine regression, and in a study of adolescent growth conducted in Mexico City it identifies a sparse, biologically interpretable set of dietary predictors.
📝 Abstract
We propose a new neural network framework, termed Neural Network Machine Regression (NNMR), which integrates trainable input gating and adaptive depth regularization to jointly perform feature selection and function estimation in an end-to-end manner. By penalizing both gating parameters and redundant layers, NNMR yields sparse and interpretable architectures while capturing complex nonlinear relationships driven by high-order synergistic effects. We further develop a post-selection inference procedure based on split-sample, permutation-based hypothesis testing, enabling valid inference without restrictive parametric assumptions. Compared with existing methods, including Bayesian kernel machine regression and widely used post hoc attribution techniques, NNMR scales efficiently to high-dimensional feature spaces while rigorously controlling type I error. Simulation studies demonstrate its superior selection accuracy and inference reliability. Finally, an empirical application reveals sparse, biologically meaningful food group predictors associated with somatic growth among adolescents living in Mexico City.
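The split-sample, permutation-based testing idea mentioned above can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `split_permutation_test` and `top_k_by_correlation` are hypothetical, the selector is a simple correlation ranking standing in for NNMR's trained input gates, and the test statistic (the R² of a least-squares fit) is an assumption chosen only to keep the example short. The sketch shows the general pattern: select features on one half of the data, then test them on the other half against a null distribution built by permuting the outcome.

```python
import numpy as np

def split_permutation_test(X, y, select, n_perm=999, seed=0):
    """Select features on one half, then permutation-test them on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    half = X.shape[0] // 2
    sel_idx, inf_idx = idx[:half], idx[half:]

    # Step 1: choose features using only the selection half.
    selected = np.asarray(select(X[sel_idx], y[sel_idx]))

    # Step 2: run inference on the held-out half, restricted to those features.
    X_inf, y_inf = X[inf_idx][:, selected], y[inf_idx]
    design = np.column_stack([np.ones(len(y_inf)), X_inf])

    def r_squared(y_vec):
        # R^2 of an ordinary least-squares fit (assumed test statistic).
        coef, *_ = np.linalg.lstsq(design, y_vec, rcond=None)
        resid = y_vec - design @ coef
        return 1.0 - (resid @ resid) / np.sum((y_vec - y_vec.mean()) ** 2)

    observed = r_squared(y_inf)
    # Permuting the outcome breaks any feature-outcome association,
    # which gives draws of the statistic under the null hypothesis.
    null = np.array([r_squared(rng.permutation(y_inf)) for _ in range(n_perm)])
    # The +1 terms count the observed statistic as one of the permutations.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return selected, p_value

def top_k_by_correlation(X, y, k=2):
    # Hypothetical stand-in selector: keep the k features with the largest
    # absolute marginal correlation with the outcome.
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(corr)[-k:]

# Synthetic data in which only feature 0 drives the outcome.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 10))
y = 2.0 * X[:, 0] + data_rng.normal(size=200)

selected, p_value = split_permutation_test(X, y, top_k_by_correlation)
print(selected, p_value)
```

Because the selection half never touches the data used for testing, the permutation p-value is not biased by the selection step, which is the motivation for sample splitting. A linear R² would miss purely synergistic effects, so a faithful implementation would use a statistic derived from the fitted nonlinear model instead.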