AI Summary
This work establishes that when the number of classes $n$ in a classification neural network satisfies $d+2 \leq n \leq 2d$, where $d$ denotes the feature dimension, the simplex equiangular tight frame central to classical neural collapse theory can no longer form, since a regular simplex with $n$ vertices requires $d \geq n-1$. By integrating Radon's theorem with tools from convex geometry, the authors rigorously prove for the first time that, in this high-class, low-dimensional regime, the learned feature vectors at convergence asymptotically arrange themselves as the vertices of an orthoplex. This finding pins down the precise geometric structure underlying neural collapse beyond the simplex configuration, substantially extending the theoretical reach of neural collapse and deepening the understanding of the limiting configurations of feature representations in deep classifiers.
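To make the orthoplex geometry concrete, here is a minimal NumPy sketch (not from the paper; the function name `orthoplex_vertices` and the choice $d=4$, $n=7$ are illustrative). It builds $n$ vertices of the $d$-dimensional orthoplex, i.e. a subset of $\{\pm e_1,\dots,\pm e_d\}$, and checks that pairwise inner products take only the values $0$ (orthogonal pairs) and $-1$ (antipodal pairs), in contrast to the simplex ETF, whose off-diagonal inner products all equal $-1/(n-1)$:

```python
import numpy as np

def orthoplex_vertices(d: int, n: int) -> np.ndarray:
    """Return n vertices of the d-dimensional orthoplex (cross-polytope).

    The full orthoplex has 2d vertices {+e_1, -e_1, ..., +e_d, -e_d};
    for d + 2 <= n <= 2d we simply take the first n of them.
    """
    assert d + 2 <= n <= 2 * d, "illustrating the orthoplex regime only"
    eye = np.eye(d)
    return np.concatenate([eye, -eye])[:n]  # shape (n, d)

d, n = 4, 7                 # one point in the regime d+2 <= n <= 2d
V = orthoplex_vertices(d, n)
G = V @ V.T                 # Gram matrix of the n unit vectors

# Off-diagonal entries are 0 (orthogonal pairs) or -1 (antipodal pairs).
off_diag = G[~np.eye(n, dtype=bool)]
print(np.unique(np.round(off_diag, 8)))    # -> [-1.  0.]
```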
Abstract
When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the number $n$ of classes satisfy $n\leq d+1$. This phenomenon is known as neural collapse. For other applications, such as language models, one instead takes $n\gg d$. Here, the neural collapse phenomenon still occurs, but with different emergent geometric figures. We characterize these geometric figures in the orthoplex regime, where $d+2\leq n\leq 2d$. The techniques in our analysis primarily involve Radon's theorem and convexity.
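For contrast with the $n\leq d+1$ regime, the following sketch uses the standard simplex ETF construction (not taken from the paper; `simplex_etf` is a name chosen here). It builds $n$ unit vectors with pairwise inner product $-1/(n-1)$ and confirms that their Gram matrix has rank $n-1$, which is exactly why the simplex description breaks down once $n\geq d+2$:

```python
import numpy as np

def simplex_etf(n: int) -> np.ndarray:
    """Columns are n unit vectors with pairwise inner product -1/(n-1).

    u_i = sqrt(n/(n-1)) * (e_i - (1/n) * 1); the columns span an
    (n-1)-dimensional subspace, so a regular simplex with n vertices
    only fits in feature dimension d >= n - 1.
    """
    c = np.sqrt(n / (n - 1))
    return c * (np.eye(n) - np.ones((n, n)) / n)

n = 5
U = simplex_etf(n)
G = U.T @ U
print(np.round(G, 8))              # 1 on the diagonal, -1/(n-1) off it
print(np.linalg.matrix_rank(G))    # n - 1, forcing d >= n - 1
```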