🤖 AI Summary
When fine-tuned on a small amount of text, pretrained language models tend to pick up spurious word co-occurrence statistics rather than genuine factual associations, so the newly learned knowledge generalizes poorly to reasoning tasks. This work identifies and separates two forms of knowledge representation in transformer-based models: true factual associations, which are encoded in the lower layers and can be reused across reasoning tasks, and co-occurrence statistics, which are encoded in the middle layers and do not generalize beyond simple question answering. Building on this analysis, the authors propose two complementary strategies for learning transferable facts: (i) implicit fact training, which fine-tunes on text that expresses the target facts implicitly rather than explicitly, and (ii) co-occurrence forgetting, which actively removes learned co-occurrence statistics during training so that factual associations can be learned from plain narrative text. Both strategies rely on the observation that the two kinds of knowledge are localized in different layers, and both are evaluated on synthetic and real-world corpora. In these experiments, the strategies improve how well knowledge learned during fine-tuning generalizes to reasoning scenarios such as indirect and multi-hop question answering.
📝 Abstract
Pretrained language models can encode a large amount of knowledge and utilize it for various reasoning tasks, yet they can still struggle to learn novel factual knowledge effectively from finetuning on limited textual demonstrations. In this work, we show that the reason for this deficiency is that language models are biased to learn word co-occurrence statistics instead of true factual associations. We identify the differences between two forms of knowledge representation in language models: knowledge in the form of co-occurrence statistics is encoded in the middle layers of the transformer model and does not generalize well to reasoning scenarios beyond simple question answering, while true factual associations are encoded in the lower layers and can be freely utilized in various reasoning tasks. Based on these observations, we propose two strategies to improve the learning of factual associations in language models. We show that training on text with implicit rather than explicit factual associations can force the model to learn factual associations instead of co-occurrence statistics, significantly improving the generalization of newly learned knowledge. We also propose a simple training method to actively forget the learned co-occurrence statistics, which unblocks and enhances the learning of factual associations when training on plain narrative text. On both synthetic and real-world corpora, the two proposed strategies improve the generalization of the knowledge learned during finetuning to reasoning scenarios such as indirect and multi-hop question answering.
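The abstract does not spell out how the forgetting step is implemented. As a minimal sketch only, one plausible reading, given that co-occurrence statistics are said to live in the middle layers, is to restore a chosen range of layers to their pretrained values after fine-tuning while keeping the fine-tuned lower layers. The function name `reset_layers`, the `layers.N.` parameter naming scheme, the layer range, and the toy scalar "weights" below are all hypothetical and are not taken from the paper.

```python
import re


def reset_layers(finetuned, pretrained, first_layer, last_layer):
    """Return a copy of `finetuned` in which every parameter belonging to
    layers first_layer..last_layer (inclusive) is restored from `pretrained`.

    Assumption: parameter names contain "layers.<index>." (hypothetical scheme).
    """
    layer_pattern = re.compile(r"layers\.(\d+)\.")
    merged = dict(finetuned)  # shallow copy; the inputs are left unchanged
    for name in finetuned:
        match = layer_pattern.search(name)
        if match and first_layer <= int(match.group(1)) <= last_layer:
            merged[name] = pretrained[name]
    return merged


# Toy parameters: plain floats stand in for weight tensors.
pretrained = {"embed.w": 0.05, "layers.0.w": 0.10, "layers.1.w": 0.20, "layers.2.w": 0.30}
finetuned = {"embed.w": 0.06, "layers.0.w": 0.15, "layers.1.w": 0.90, "layers.2.w": 0.70}

# Assumed choice: keep layer 0 (a "lower" layer), reset layers 1 and 2.
merged = reset_layers(finetuned, pretrained, first_layer=1, last_layer=2)
print(merged)
```

In this toy setup the embedding and layer 0 keep their fine-tuned values, while layers 1 and 2 return to the pretrained ones. In a real model the same merge would be applied to a parameter dictionary before training continues, and which layers to reset would have to be chosen from the paper's own layer analysis.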