Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting

📅 2024-10-21
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the learning mechanism of restricted Boltzmann machines (RBMs) for structured data within a teacher-student framework, asking whether a student RBM can recover the true latent representation from data generated by a teacher RBM whose hidden variables are correlated. Using statistical-physics mean-field analysis and temperature-regularized inference, the authors characterize how structural strength, quantified by the number of latent patterns and the correlation between rows of the weight matrix, affects the critical sample size required for successful learning. They find that stronger structure markedly reduces the required sample complexity; that, in the absence of correlations, performance is independent of the pattern count; and that excessively low inference temperatures suppress pattern acquisition, leading to learning failure. Crucially, they show that the student can match teacher patterns one-to-one or many-to-one, extending previous two-hidden-unit results to an arbitrary finite number of hidden units. These results provide an analytically tractable generative-model toy setting for studying the "lottery ticket hypothesis."

📝 Abstract
Restricted Boltzmann machines (RBMs) are generative models capable of learning data with a rich underlying structure. We study the teacher-student setting where a student RBM learns structured data generated by a teacher RBM. The amount of structure in the data is controlled by adjusting the number of hidden units of the teacher and the correlations in the rows of the weights, a.k.a. patterns. In the absence of correlations, we validate the conjecture that the performance is independent of the number of teacher patterns and hidden units of the student RBM, and we argue that the teacher-student setting can be used as a toy model for studying the lottery ticket hypothesis. Beyond this regime, we find that the critical amount of data required to learn the teacher patterns decreases with both their number and correlations. In both regimes, we find that, even with a relatively large dataset, it becomes impossible to learn the teacher patterns if the inference temperature used for regularization is kept too low. In our framework, the student can learn teacher patterns one-to-one or many-to-one, generalizing previous findings about the teacher-student setting with two hidden units to any arbitrary finite number of hidden units.
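The setup described in the abstract can be sketched in a few lines of NumPy: a teacher RBM with correlated weight rows (patterns) generates data by Gibbs sampling, a student RBM is trained on that data, and recovery is measured by the overlap between student and teacher patterns. This is only an illustrative toy, not the paper's mean-field analysis; the sizes, the correlation parameterization `rho`, and the use of CD-1 training are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

N, P_teacher, P_student = 20, 3, 3  # visible units, teacher/student hidden units (toy sizes)
beta = 1.0                          # inverse inference temperature (regularization knob)

# Teacher patterns: rows of the weight matrix. Each row copies a shared base
# pattern with probability (1 + rho) / 2, inducing inter-row correlation
# (hypothetical parameterization, chosen just for this sketch).
rho = 0.3
base = rng.choice([-1.0, 1.0], size=N)
W_teacher = np.array([
    np.where(rng.random(N) < (1 + rho) / 2, base, rng.choice([-1.0, 1.0], size=N))
    for _ in range(P_teacher)
]) / np.sqrt(N)

def gibbs_step(v, W):
    """One block-Gibbs sweep of a +/-1 RBM: sample hidden given visible, then visible given hidden."""
    h = np.where(rng.random(W.shape[0]) < 1 / (1 + np.exp(-2 * beta * (W @ v))), 1.0, -1.0)
    v = np.where(rng.random(W.shape[1]) < 1 / (1 + np.exp(-2 * beta * (W.T @ h))), 1.0, -1.0)
    return v

def sample_data(W, n_samples, burn_in=50):
    """Generate a dataset from the teacher by running short independent Gibbs chains."""
    data = []
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=W.shape[1])
        for _ in range(burn_in):
            v = gibbs_step(v, W)
        data.append(v)
    return np.array(data)

data = sample_data(W_teacher, 200)

# Train the student with CD-1 (contrastive divergence, one reconstruction step).
W_student = 0.01 * rng.standard_normal((P_student, N))
lr = 0.05
for epoch in range(100):
    for v0 in data:
        h0 = np.tanh(beta * (W_student @ v0))  # mean hidden activity, data phase
        v1 = gibbs_step(v0, W_student)         # one-step reconstruction, model phase
        h1 = np.tanh(beta * (W_student @ v1))
        W_student += lr / N * (np.outer(h0, v0) - np.outer(h1, v1))

# Overlap matrix: entry (a, b) is the normalized alignment of student pattern a
# with teacher pattern b; values near 1 in magnitude indicate recovery, and the
# matching can be one-to-one or many-to-one as in the paper's terminology.
overlap = (W_student / np.linalg.norm(W_student, axis=1, keepdims=True)) @ \
          (W_teacher / np.linalg.norm(W_teacher, axis=1, keepdims=True)).T
print(np.round(np.abs(overlap), 2))
```

Inspecting which rows of `overlap` align with which teacher rows is the toy analogue of the paper's pattern-matching question; lowering `beta` too far or shrinking the dataset should degrade the overlaps, mirroring the temperature and sample-complexity effects discussed above.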
Problem

Research questions and friction points this paper is trying to address.

Studying RBM learning of structured data in teacher-student setting
Analyzing impact of pattern correlations on critical data requirements
Exploring temperature effects on teacher pattern learnability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Student RBM learns structured teacher RBM data
Data structure controlled by hidden units and correlations
Critical data amount decreases with pattern correlations
Robin Thériault
Scuola Normale Superiore di Pisa, Piazza dei Cavalieri 7, 56126, Pisa (PI), Italy
Francesco Tosello
Department of Mathematics, The University of British Columbia, 1984 Mathematics Road, V6T 1Z2, Vancouver, Canada
Daniele Tantari
Università di Bologna
Statistical Inference, Statistical Mechanics, Quantitative Finance, Complex Networks, Spin Glasses