🤖 AI Summary
This study addresses the challenge of improving the generalization of analogical reasoning in artificial intelligence systems to novel alphabets and compositional transformations. To this end, the authors incorporate a copying task as an intermediate training step and combine it with the Meta-Learning for Compositionality (MLC) strategy to train a three-layer encoder-decoder Transformer model. This approach steers the model toward the structural essence of analogical tasks, substantially improving its generalization to unseen alphabets. Experimental results show that, when trained on more heterogeneous datasets, the model outperforms most frontier models on new alphabets and also generalizes to compositions of trained transformations, though not to entirely novel ones. Moreover, through interpretability analyses, the study identifies an internal algorithm that approximates the model's computations and shows that the model can be steered precisely as that algorithm predicts.
📝 Abstract
Analogical reasoning is a hallmark of human intelligence, enabling us to solve new problems by transferring knowledge from one situation to another. Yet developing artificial intelligence systems capable of robust, human-like analogical reasoning has proven difficult. In this work, we train transformers using Meta-Learning for Compositionality (MLC) on an analogical reasoning task (letter-string analogies) and assess their generalization capabilities. We find that letter-string analogies become learnable when the models are guided to attend to the most informative problem elements, which we induce by including copying tasks in the training data. Furthermore, generalization to new alphabets improves when models are trained on more heterogeneous datasets; our 3-layer encoder-decoder model then outperforms most frontier models. The MLC approach also enables some generalization to compositions of trained transformations, but not to completely novel transformations. To understand how the model operates, we identify an algorithm that approximates the model's computations. We verify this using interpretability analyses and show that the model can be steered precisely according to expectations derived from the algorithm. Finally, we discuss the implications of our findings for the generalization capabilities of larger models and for parallels to human analogical reasoning.
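For readers unfamiliar with the domain, a minimal sketch of what a letter-string analogy and a copying task look like may help. This is purely illustrative and not the authors' code; the specific transformation (replace the last letter with its alphabetic successor, as in the classic "abc → abd" problems) is an assumption about the style of task studied, not taken from the paper.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def successor(c: str) -> str:
    """Return the alphabetic successor of a letter (wrapping z -> a)."""
    return ALPHABET[(ALPHABET.index(c) + 1) % 26]

def solve_last_letter_successor(source: str) -> str:
    """Apply the rule inferred from 'abc -> abd' (last letter -> successor)
    to a new source string, i.e. solve the analogy abc:abd :: source:?"""
    return source[:-1] + successor(source[-1])

def copy_task(s: str) -> str:
    """Copying task: the target output is the input verbatim. Including such
    items in training pushes the model to attend to the problem elements
    themselves before learning transformations over them."""
    return s

# abc : abd :: ijk : ?
print(solve_last_letter_successor("ijk"))  # -> ijl
print(copy_task("ijk"))                    # -> ijk
```

Generalization to a new alphabet then amounts to applying the same abstract rule over a different symbol set, which is what the heterogeneous training data is meant to encourage.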