🤖 AI Summary
Existing estimators of the global dimension of neural representations, such as the participation ratio of eigenvalues, are highly sensitive to sample size and show substantial bias when samples are few or noisy. This work introduces a bias-corrected dimension estimator built on a theoretical eigenvalue correction, weighted subsampling, and explicit noise modeling, so that its estimates stay stable as the sample count and the noise level change. The framework is further generalized to estimate local intrinsic dimension on curved manifolds. On synthetic benchmarks, the method recovers the known ground-truth dimensions. Across several neural data modalities (calcium imaging, electrophysiology, fMRI, and large language model activations), its estimates are invariant to sample size. This makes dimensionality estimation more reliable in data-limited regimes and supports more trustworthy representational analyses in both biological and artificial neural networks.
📝 Abstract
The global dimensionality of a neural representation manifold provides rich insight into the computational processes underlying both artificial and biological neural networks. However, all existing measures of global dimensionality are sensitive to the number of samples, i.e., the number of rows and columns of the sample matrix. We show that, in particular, the participation ratio of eigenvalues, a popular measure of global dimensionality, is highly biased at small sample sizes, and we propose a bias-corrected estimator that is more accurate with finite samples and with noise. On synthetic data examples, we demonstrate that our estimator recovers the true known dimensionality. We apply our estimator to neural recordings, including calcium imaging, electrophysiological recordings, and fMRI data, and to the neural activations in a large language model, and show that our estimator is invariant to the sample size. Finally, our estimator can also be used to measure the local dimensionalities of curved neural manifolds by weighting the finite samples appropriately.
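To make the small-sample bias concrete, here is a minimal sketch of the standard (uncorrected) participation ratio, PR = (Σᵢ λᵢ)² / Σᵢ λᵢ², where λᵢ are the eigenvalues of the sample covariance matrix. It is not the bias-corrected estimator proposed in the paper, which the abstract does not specify. The function names, the isotropic Gaussian test data, and the chosen sample sizes are all assumptions made for illustration.

```python
import numpy as np


def participation_ratio(eigvals):
    """Participation ratio (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals.sum() ** 2 / np.square(eigvals).sum()


def naive_pr(X):
    """Uncorrected participation ratio of a (n_samples, n_features) data matrix.

    The squared singular values of the centered data are proportional to the
    eigenvalues of the sample covariance, and the ratio is scale-invariant,
    so no normalization by the sample count is needed.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    return participation_ratio(singular_values ** 2)


# Illustrative setup (an assumption, not data from the paper): isotropic
# Gaussian samples whose population covariance is the identity, so the
# population participation ratio equals the ambient dimension.
rng = np.random.default_rng(0)
true_dim = 50
results = {}
for n_samples in (20, 100, 1000, 10000):
    X = rng.standard_normal((n_samples, true_dim))
    results[n_samples] = naive_pr(X)
    print(f"n_samples={n_samples:>6d}  naive PR={results[n_samples]:.1f}")
```

With only 20 samples the centered data matrix has rank at most 19, so the naive estimate cannot reach the population value of 50 however the data are drawn; it approaches 50 only as the sample count grows well beyond the dimension. This dependence on sample size is the behavior a bias-corrected estimator is meant to remove.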