Storage capacity of perceptron with variable selection

📅 2025-12-01
🤖 AI Summary
This work addresses the challenge of distinguishing genuine structure from spurious correlations in high-dimensional data. It studies the storage capacity of a perceptron with variable selection: the largest pattern load α = P/N at which a simple perceptron can perfectly classify P = αN random patterns using an optimally chosen subset of M = ρN out of N variables. Cover–Gardner theory shows that a randomly chosen subset of ρN variables separates the patterns if and only if α < 2ρ. Using the replica method from statistical mechanics, an analytical but non-rigorous technique, the authors enumerate the variable combinations that allow perfect classification and show that optimal selection can surpass this bound. The resulting relation between ρ and α gives a quantitative criterion for telling true structure apart from spurious regularities, and it also yields the storage capacity of associative memory models with sparse asymmetric couplings.

📝 Abstract
A central challenge in machine learning is to distinguish genuine structure from chance correlations in high-dimensional data. In this work, we address this issue for the perceptron, a foundational model of neural computation. Specifically, we investigate the relationship between the pattern load $α$ and the variable selection ratio $ρ$ for which a simple perceptron can perfectly classify $P = αN$ random patterns by optimally selecting $M = ρN$ variables out of $N$ variables. While the Cover--Gardner theory establishes that a random subset of $ρN$ dimensions can separate $αN$ random patterns if and only if $α < 2ρ$, we demonstrate that optimal variable selection can surpass this bound by developing a method, based on the replica method from statistical mechanics, for enumerating the combinations of variables that enable perfect pattern classification. This not only provides a quantitative criterion for distinguishing true structure in the data from spurious regularities, but also yields the storage capacity of associative memory models with sparse asymmetric couplings.
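
The $α < 2ρ$ condition for a random subset can be illustrated at finite size with Cover's counting function, which gives the probability that $P$ randomly labeled points in general position in $M$ dimensions are separable by a hyperplane through the origin. The sketch below is not taken from the paper: the function name is hypothetical, and the general-position assumption holds only approximately for random patterns.

```python
from math import comb


def cover_separable_probability(num_patterns: int, num_dims: int) -> float:
    """Probability that random labels on `num_patterns` points in general
    position in `num_dims` dimensions are linearly separable (no bias term).

    Uses Cover's count of realizable dichotomies,
    2 * sum_{k=0}^{M-1} C(P-1, k), divided by the 2^P possible labelings.
    """
    if num_patterns <= 0:
        return 1.0
    dichotomies = 2 * sum(comb(num_patterns - 1, k) for k in range(num_dims))
    return dichotomies / 2 ** num_patterns


if __name__ == "__main__":
    # Illustrative sizes (assumptions): N = 200 variables, random subset of M = rho * N.
    n_vars, rho = 200, 0.25
    n_selected = int(rho * n_vars)
    for alpha in (0.3, 0.4, 0.5, 0.6, 0.7):
        n_patterns = int(alpha * n_vars)
        prob = cover_separable_probability(n_patterns, n_selected)
        print(f"alpha={alpha:.1f}  alpha/rho={alpha / rho:.1f}  P(separable)={prob:.4f}")
```

The probability equals exactly 1/2 at $P = 2M$ and the crossover sharpens as $N$ grows, which is the finite-size counterpart of the threshold $α = 2ρ$.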
Problem

Research questions and friction points this paper is trying to address.

Investigates how optimal variable selection affects the storage capacity of a perceptron
Develops a replica-based method for enumerating the variable combinations that allow perfect classification
Provides a quantitative criterion for distinguishing genuine structure from spurious correlations in high-dimensional data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Perceptron model with optimal variable selection
Replica method for enumerating variable combinations
Shows that optimal variable selection surpasses the Cover–Gardner bound α < 2ρ for random subsets
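
To give a feel for what enumerating variable combinations involves, the sketch below computes a much cruder quantity than the replica calculation: the expected number of size-M subsets that separate the patterns, obtained by multiplying the number of subsets by Cover's separability probability for a single subset. By Markov's inequality, once this expectation drops below 1 a separating subset is unlikely to exist, so this gives only a first-moment upper bound on the achievable load. The function names are hypothetical, and the patterns are assumed to be in general position.

```python
from fractions import Fraction
from math import comb


def expected_separating_subsets(n_vars: int, n_selected: int, n_patterns: int) -> Fraction:
    """Expected number of size-`n_selected` variable subsets that separate
    `n_patterns` randomly labeled patterns (first-moment / annealed count)."""
    subsets = comb(n_vars, n_selected)
    # Cover's count of realizable dichotomies for one fixed subset.
    dichotomies = 2 * sum(comb(n_patterns - 1, k) for k in range(n_selected))
    return Fraction(subsets * dichotomies, 2 ** n_patterns)


def first_moment_load_bound(n_vars: int, n_selected: int) -> int:
    """Smallest number of patterns at which the expected count falls below 1."""
    n_patterns = n_selected
    while expected_separating_subsets(n_vars, n_selected, n_patterns) >= 1:
        n_patterns += 1
    return n_patterns


if __name__ == "__main__":
    # Illustrative sizes (assumptions): N = 40 variables, M = 10 selected.
    n_vars, n_selected = 40, 10
    bound = first_moment_load_bound(n_vars, n_selected)
    print(f"random-subset threshold P = 2M = {2 * n_selected}")
    print(f"first-moment upper bound on P = {bound}")
```

Because the number of candidate subsets grows exponentially in N, the expected count stays above 1 well beyond P = 2M. That leaves room for optimal selection to exceed the random-subset bound, but the first-moment count does not say by how much; that is the question the replica analysis addresses.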