🤖 AI Summary
This work investigates the stage-wise dynamics by which Transformers learn latent structural knowledge. Method: Leveraging the Alchemy benchmark, we decouple structural learning into three interpretable subtasks (missing-rule inference, multi-step rule composition, and complex-example decomposition) and systematically track how these capabilities evolve in small decoder-only Transformers under a unified experimental framework. Contribution/Results: We identify a discrete, stage-wise developmental trajectory: models first rapidly acquire coarse-grained syntactic rules, then gradually integrate them into full structural representations. While robust at composing rules, the models struggle to invert basic rules from complex examples, revealing a fundamental "composition-easy, decomposition-hard" asymmetry. This is the first systematic characterization of the dynamic learning pathway for structural knowledge in Transformers, providing fine-grained empirical evidence for bottom-up, hierarchical knowledge construction in large language models.
📝 Abstract
While transformers can discover latent structure from context, the dynamics of how they acquire different components of that structure remain poorly understood. In this work, we use the Alchemy benchmark to investigate the dynamics of latent structure learning. We train a small decoder-only transformer on three task variants: 1) inferring missing rules from partial contextual information, 2) composing simple rules to solve multi-step sequences, and 3) decomposing complex multi-step examples to infer intermediate steps. By factorizing each task into interpretable events, we show that the model acquires capabilities in discrete stages, first learning coarse-grained rules before learning the complete latent structure. We also identify a crucial asymmetry: the model composes fundamental rules robustly but struggles to decompose complex examples to recover those rules. These findings offer new insight into how a transformer model learns latent structures, providing a granular view of how these capabilities evolve during training.
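As a rough illustration of the three task variants, here is a minimal toy sketch. The rule table, symbols, and function names below are hypothetical stand-ins, not the Alchemy benchmark's actual chemistry or the paper's data pipeline; they are meant only to make the inference/composition/decomposition distinction concrete.

```python
# Hypothetical latent structure: a tiny table of single-step rewrite rules
# (a stand-in for Alchemy-style transformation rules, not the real benchmark).
RULES = {"a": "b", "b": "c", "c": "d"}

def infer_missing_rule(examples):
    """Variant 1 (missing-rule inference): recover the rule table from
    partial context given as (input, output) pairs of single rule steps."""
    return {x: y for x, y in examples}

def compose(symbol, steps):
    """Variant 2 (composition): chain simple rules to solve a
    multi-step sequence."""
    for _ in range(steps):
        symbol = RULES[symbol]
    return symbol

def decompose(start, end):
    """Variant 3 (decomposition): given a complex multi-step example
    (start, end), recover the intermediate states it passed through."""
    path = [start]
    while path[-1] != end:
        path.append(RULES[path[-1]])
    return path
```

For example, `compose("a", 2)` chains two rule applications (a → b → c), while `decompose("a", "d")` must reconstruct every intermediate state of the same chain; the paper's asymmetry result is that the former is learned far more robustly than the latter.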