🤖 AI Summary
This work addresses the problem of determining the minimal sample size required for weak reconstruction of the relevant index subspace in high-dimensional Gaussian multi-index models. Because theoretical characterizations beyond the single-index setting remain elusive, we introduce spectral algorithms based on the linearization of a message passing scheme tailored to this problem. We show that these methods achieve the optimal weak-reconstruction threshold. Leveraging a high-dimensional characterization of the algorithms, we show that above this critical threshold the leading eigenvector correlates with the true index subspace, a phenomenon reminiscent of the Baik-Ben Arous-Péché (BBP) transition in spiked models from random matrix theory. Numerical experiments support the theoretical predictions. Our work helps close gaps in the understanding of the computational limits of weak learnability in multi-index models.
📝 Abstract
We consider the problem of how many samples from a Gaussian multi-index model are required to weakly reconstruct the relevant index subspace. Despite the model's increasing popularity as a testbed for investigating the computational complexity of neural networks, results beyond the single-index setting remain elusive. In this work, we introduce spectral algorithms based on the linearization of a message passing scheme tailored to this problem. Our main contribution is to show that the proposed methods achieve the optimal reconstruction threshold. Leveraging a high-dimensional characterization of the algorithms, we show that above the critical threshold the leading eigenvector correlates with the relevant index subspace, a phenomenon reminiscent of the Baik-Ben Arous-Péché (BBP) transition in spiked models arising in random matrix theory. Supported by numerical experiments and a rigorous theoretical framework, our work bridges critical gaps in the computational limits of weak learnability in multi-index models.
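To make the BBP analogy concrete, the sketch below simulates the classical transition in a rank-one spiked Wigner matrix. It is not the spectral algorithm proposed in this work: the matrix model, the function name `spiked_wigner_overlap`, and the parameters `n`, `snr`, and `seed` are illustrative assumptions. In this toy model the standard asymptotic prediction is that, for signal strength above 1, the top eigenvalue detaches from the bulk edge at 2 and the leading eigenvector has squared overlap approaching 1 - 1/snr^2 with the spike; below 1 the overlap vanishes as n grows.

```python
import numpy as np


def spiked_wigner_overlap(n, snr, seed=0):
    """Top eigenvalue and squared overlap for a rank-one spiked Wigner matrix.

    Toy model (an assumption for illustration, not the paper's estimator):
        Y = snr * v v^T + W,
    where v is a unit vector and W is a symmetric Gaussian matrix whose
    off-diagonal entries have variance 1/n, so its bulk spectrum fills [-2, 2].
    """
    rng = np.random.default_rng(seed)

    # Planted direction, normalized to unit length.
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)

    # Symmetric Gaussian noise with off-diagonal variance 1/n.
    g = rng.standard_normal((n, n))
    noise = (g + g.T) / np.sqrt(2 * n)

    y = snr * np.outer(v, v) + noise

    # eigh returns eigenvalues in ascending order; the last column is the
    # eigenvector of the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(y)
    top_vec = eigvecs[:, -1]
    return float(eigvals[-1]), float(np.dot(top_vec, v) ** 2)


if __name__ == "__main__":
    for snr in (0.5, 1.5, 2.0):
        top_eig, overlap_sq = spiked_wigner_overlap(n=800, snr=snr)
        print(f"snr={snr}: top eigenvalue={top_eig:.3f}, squared overlap={overlap_sq:.3f}")
```

Sweeping `snr` across 1 shows the qualitative behaviour referred to in the abstract: the leading eigenvector is uninformative below the threshold and correlates with the planted direction above it.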