Training a neural network for data reduction and better generalization

📅 2024-11-26
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural networks suffer from high computational resource consumption and poor interpretability. Method: This paper proposes an adaptive sparsification training framework. It empirically identifies a phase-transition phenomenon in feature selection for nonlinear neural networks—first demonstrated in this setting. Leveraging LASSO-type regularization and compressive sensing principles, the method integrates gradient descent with hard-thresholding iterations to automatically determine the optimal regularization strength without requiring a validation set. It supports diverse loss functions and sparsity constraints, and is compatible with architectures ranging from shallow to deep networks. Contributions/Results: (1) Achieves an optimal trade-off between data compression and generalization performance; (2) Automatically identifies a small subset of critical input features that dominantly drive generalization, thereby enhancing model interpretability; (3) Releases an open-source Python toolkit enabling out-of-the-box sparse modeling and interpretable analysis.
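The core training loop described in the summary, gradient descent interleaved with hard-thresholding iterations, can be sketched in a few lines. The snippet below is a generic iterative-hard-thresholding (IHT) sketch for a k-sparse least-squares model, not the paper's AnnHarderLasso implementation; the synthetic data, sparsity level `k`, and learning rate are illustrative assumptions.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-k:]
    out[keep] = w[keep]
    return out

def iht_least_squares(X, y, k, lr=0.1, n_iter=200):
    """Gradient descent on the squared loss, interleaved with hard thresholding."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # least-squares gradient
        w = hard_threshold(w - lr * grad, k)  # project back onto k-sparse vectors
    return w

# Synthetic data: only the first 3 of 50 input features matter.
rng = np.random.default_rng(0)
n, d = 200, 50
w_true = np.zeros(d)
w_true[:3] = [4.0, -3.0, 2.0]
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat = iht_least_squares(X, y, k=3)
print(sorted(np.flatnonzero(w_hat)))  # indices of the selected (nonzero) features
```

In a neural-network setting the same idea applies per training step: take a gradient step on the loss, then threshold the input-layer weights so only a small set of features stays active.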

📝 Abstract
At a time of environmental concern about artificial intelligence, in particular its appetite for storage and computation, sparsity-inducing neural networks offer a promising path toward frugality and less waste. Sparse learners compress the inputs (features) by selecting only those needed for good generalization, and a human scientist can then give an intelligent interpretation to the few selected features. If genes are the inputs and cancer type is the output, the selected genes give the oncologist clues about which genes have an effect on certain cancers. LASSO-type regularization leads to good input selection for linear associations, but few attempts have been made for nonlinear associations modeled by artificial neural networks. A stringent but efficient test of whether a feature selection method works is to check for a phase transition in the probability of retrieving the relevant features, as observed and mathematically studied for linear models. Our method exhibits exactly such a phase transition for artificial neural networks and, on real data, achieves the best compromise between the number of selected features and generalization performance. The method is flexible, applying to models ranging from shallow to deep artificial neural networks and supporting various cost functions and sparsity-promoting penalties. It does not rely on cross-validation or a validation set to select its single regularization parameter, making it user-friendly. Our approach can be seen as a form of compressed sensing for complex models, distilling high-dimensional data into a compact, interpretable subset of meaningful features, the opposite of a black box. A Python package containing all the simulations and ready-to-use models is available at https://github.com/VcMaxouuu/AnnHarderLasso.
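The phase-transition test from the abstract can be probed numerically: sweep the sample size and estimate the probability that a sparse learner recovers exactly the relevant features. The sketch below uses a plain LASSO solved by proximal gradient (ISTA) on a linear model as a stand-in for the paper's nonlinear method; the problem sizes, penalty `lam`, and support-detection threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ista_lasso(X, y, lam, lr=0.05, n_iter=500):
    """LASSO via proximal gradient: gradient step, then soft-thresholding."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        w = w - lr * X.T @ (X @ w - y) / n                      # gradient step on squared loss
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # prox of lam * ||w||_1
    return w

def recovery_rate(n, d=64, k=4, lam=0.2, trials=10, seed=0):
    """Fraction of trials in which the estimated support equals the true one."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        w_true = np.zeros(d)
        w_true[:k] = 3.0
        X = rng.standard_normal((n, d))
        y = X @ w_true + 0.1 * rng.standard_normal(n)
        w = ista_lasso(X, y, lam)
        if set(np.flatnonzero(np.abs(w) > 0.5)) == set(range(k)):
            hits += 1
    return hits / trials

# The recovery probability jumps from near 0 to near 1 as n grows past a threshold.
rates = {n: recovery_rate(n) for n in (8, 16, 32, 64, 128)}
print(rates)
```

Plotting `rates` against `n` produces the characteristic S-shaped curve; the paper's claim is that its neural-network method exhibits the same sharp transition previously established for linear models.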
Problem

Research questions and friction points this paper is trying to address.

Develops sparse neural networks for efficient data reduction
Selects relevant features for better generalization in nonlinear models
Provides interpretable feature selection without cross-validation reliance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse learners compress inputs for better generalization
Phase transition testing ensures relevant feature retrieval
Flexible method for shallow to deep neural networks