🤖 AI Summary
This work investigates the inductive bias mechanisms of deep neural networks (DNNs) on Boolean data, aiming to establish an end-to-end analyzable theoretical framework that links prior assumptions and training dynamics (particularly feature learning) to generalization performance. Methodologically, it constructs a rigorous correspondence between depth-2 discrete fully connected networks and DNF logical formulas, enabling, for the first time, a formal characterization of how feature formation and inductive bias jointly drive generalization under Boolean function modeling. By integrating a Monte Carlo learning algorithm with DNF interpretability analysis, the resulting framework yields predictable training, interpretable features, and analyzable generalization behavior. It thereby overcomes an inherent limitation of continuous-approximation methods such as the Neural Tangent Kernel (NTK), which neglect feature learning, and establishes a novel paradigm for understanding the fundamental learning principles of DNNs on discrete structures.
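The "Monte Carlo learning algorithm" mentioned above is not specified here; as a hedged illustration only (not the paper's method), the general idea of Monte Carlo learning over discrete weights can be sketched as a local search that proposes random sign flips and keeps those that do not increase the training loss. The function name `train_monte_carlo` and the toy loss are assumptions introduced for this sketch.

```python
import random

def train_monte_carlo(loss, weights, steps=1000, seed=0):
    """Hedged sketch: Monte Carlo local search over discrete weights in
    {-1, +1}. Propose one sign flip at a time; accept it only if the loss
    does not increase. This is an illustration, not the paper's algorithm."""
    rng = random.Random(seed)
    w = list(weights)
    best = loss(w)
    for _ in range(steps):
        i = rng.randrange(len(w))
        w[i] = -w[i]            # propose flipping one discrete weight
        new = loss(w)
        if new <= best:
            best = new          # accept the proposal
        else:
            w[i] = -w[i]        # reject: undo the flip
    return w, best

# Toy objective: match a fixed target sign pattern (Hamming distance).
target = [1, -1, 1, 1, -1, -1]
loss = lambda w: sum(a != b for a, b in zip(w, target))
w, err = train_monte_carlo(loss, [-1] * 6)
```

Because proposals that increase the loss are rejected, the search is monotone in training error, which is one way such discrete dynamics can be made predictable and analyzable.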
📝 Abstract
Deep neural networks are renowned for their ability to generalise well across diverse tasks, even when heavily overparameterised. Existing works offer only partial explanations (for example, the NTK-based task-model alignment explanation neglects feature learning). Here, we provide an end-to-end, analytically tractable case study that links a network's inductive prior, its training dynamics including feature learning, and its eventual generalisation. Specifically, we exploit the one-to-one correspondence between depth-2 discrete fully connected networks and disjunctive normal form (DNF) formulas by training on Boolean functions. Under a Monte Carlo learning algorithm, our model exhibits predictable training dynamics and the emergence of interpretable features. This framework allows us to trace, in detail, how inductive bias and feature formation drive generalisation.
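The claimed correspondence between depth-2 discrete networks and DNF formulas can be made concrete with a minimal sketch (my construction, under assumed conventions, not necessarily the paper's): over inputs in {-1, +1}, each DNF term (a conjunction of literals) is realised as a hidden threshold neuron that fires only when all of its literals agree with the input, and the output unit computes an OR over the hidden units. The helper `make_dnf_network` and its term encoding are assumptions introduced for this illustration.

```python
def make_dnf_network(terms, n):
    """Hedged sketch of a DNF formula as a depth-2 discrete network.
    terms: list of dicts {var_index: sign}, sign +1 for x_i, -1 for NOT x_i.
    Returns a function mapping an input in {-1,+1}^n to an output in {-1,+1}."""
    def forward(x):
        hidden = []
        for term in terms:
            # AND gate as a threshold neuron: the signed sum reaches len(term)
            # exactly when every literal in the term agrees with the input.
            pre = sum(sign * x[i] for i, sign in term.items())
            hidden.append(1 if pre == len(term) else -1)
        # Output unit is an OR: fires iff at least one hidden unit fires.
        return 1 if any(h == 1 for h in hidden) else -1
    return forward

# Example formula: (x0 AND NOT x1) OR x2, on n = 3 Boolean variables.
f = make_dnf_network([{0: 1, 1: -1}, {2: 1}], n=3)
```

Here each hidden neuron is interpretable by construction: it is exactly one conjunctive term of the formula, which is the kind of feature-level readability the correspondence is meant to provide.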