🤖 AI Summary
Transformers exhibit weak systematic and compositional out-of-distribution (OOD) generalization—particularly hindering performance on complex reasoning tasks.
Method: We propose a recursive latent-space reasoning framework built upon standard Transformer architecture, integrating four novel mechanisms: (1) input-adaptive recurrent structure, (2) algorithm-level supervision signals, (3) discrete bottleneck-constrained latent representations, and (4) an explicit error-feedback-driven correction module. The resulting model is modular, scalable, and interpretable.
Contributions/Results: Evaluated on compositional arithmetic reasoning tasks (e.g., GSM8K-style problems) over computational graphs, our approach achieves significant gains in OOD generalization. Interpretability analysis confirms that the mechanisms jointly induce robust, structured reasoning paths—demonstrating improved logical compositionality. This work establishes a new paradigm for enhancing the logical generalization capability of large language models.
📝 Abstract
Systematic, compositional generalization beyond the training distribution remains a core challenge in machine learning -- and a critical bottleneck for the emergent reasoning abilities of modern language models. This work investigates out-of-distribution (OOD) generalization in Transformer networks using a GSM8K-style modular arithmetic on computational graphs task as a testbed. We introduce and explore a set of four architectural mechanisms aimed at enhancing OOD generalization: (i) input-adaptive recurrence; (ii) algorithmic supervision; (iii) anchored latent representations via a discrete bottleneck; and (iv) an explicit error-correction mechanism. Collectively, these mechanisms yield an architectural approach for native and scalable latent space reasoning in Transformer networks with robust algorithmic generalization capabilities. We complement these empirical results with a detailed mechanistic interpretability analysis that reveals how these mechanisms give rise to robust OOD generalization abilities.