Learning to Substitute Components for Compositional Generalization

📅 2025-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Neural language models often fail at compositional generalization, particularly when generalization requires multi-grained compositional bias (both lexical and structural) or when training sentences have an imbalanced difficulty distribution, and hand-crafted augmentation strategies yield only limited gains in these settings. To address this, the paper proposes CompSub, a compositional augmentation strategy that substitutes substantial multi-grained substructures across the training set, and LCS, a framework that learns CompSub's substitution probabilities end-to-end by maximizing the language model's loss, thereby prioritizing challenging compositions with elusive concepts and novel contexts. The authors further extend these ideas to in-context learning, introducing LCS-ICL to improve the few-shot compositional generalization of state-of-the-art LLMs. On four benchmarks (SCAN, COGS, GeoQuery, and COGS-QL), the methods achieve accuracy improvements of up to 66.5%, 10.3%, 1.4%, and 8.8%, respectively, outperforming prior methods.

📝 Abstract
Despite the rising prevalence of neural language models, recent empirical evidence suggests their deficiency in compositional generalization. One of the current de-facto solutions to this problem is compositional data augmentation, which aims to introduce additional compositional inductive bias. However, existing handcrafted augmentation strategies offer limited improvement when systematic generalization of neural language models requires multi-grained compositional bias (i.e., not limited to either lexical or structural biases alone) or when training sentences have an imbalanced difficulty distribution. To address these challenges, we first propose a novel compositional augmentation strategy called Component Substitution (CompSub), which enables multi-grained composition of substantial substructures across the entire training set. Furthermore, we introduce the Learning Component Substitution (LCS) framework. This framework empowers the learning of component substitution probabilities in CompSub in an end-to-end manner by maximizing the loss of neural language models, thereby prioritizing challenging compositions with elusive concepts and novel contexts. We extend the key ideas of CompSub and LCS to the recently emerging in-context learning scenarios of pre-trained large language models (LLMs), proposing the LCS-ICL algorithm to enhance the few-shot compositional generalization of state-of-the-art (SOTA) LLMs. Theoretically, we provide insights into why applying our algorithms to language models can improve compositional generalization performance. Empirically, our results on four standard compositional generalization benchmarks (SCAN, COGS, GeoQuery, and COGS-QL) demonstrate the superiority of CompSub, LCS, and LCS-ICL, with improvements of up to 66.5%, 10.3%, 1.4%, and 8.8%, respectively.
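As a rough illustration of the component-substitution idea described in the abstract, the sketch below swaps an aligned component from a donor training example into a base example. This is not the authors' implementation: the component spans and their alignment are assumed to be given, and the function name is hypothetical.

```python
def substitute_component(base_tokens, base_span, donor_tokens, donor_span):
    """Replace the half-open span base_tokens[start:end] with the
    aligned component taken from donor_tokens[start:end]."""
    b_start, b_end = base_span
    d_start, d_end = donor_span
    return base_tokens[:b_start] + donor_tokens[d_start:d_end] + base_tokens[b_end:]

# SCAN-style example: replace the single-token verb component "jump"
# in the base command with a larger multi-token donor component,
# yielding a novel composition not present in either source sentence.
base = "jump twice after walk left".split()
donor = "look around right twice".split()
augmented = substitute_component(base, (0, 1), donor, (0, 3))
print(" ".join(augmented))  # look around right twice after walk left
```

Because components may span multiple tokens, the same operation covers both lexical substitutions (single words) and larger structural ones, which is the multi-grained aspect emphasized in the abstract.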
Problem

Research questions and friction points this paper is trying to address.

Neural language models lack compositional generalization capabilities.
Existing augmentation strategies fail in multi-grained compositional bias scenarios.
Training sentences often have imbalanced difficulty distributions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Component Substitution (CompSub) enables multi-grained composition of substructures
Learning Component Substitution (LCS) framework learns substitution probabilities end-to-end via loss maximization
LCS-ICL algorithm enhances few-shot compositional generalization of LLMs
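The loss-maximization idea behind LCS can be sketched as a hardness-weighted sampler over augmentation candidates. This is a hedged toy version: `model_loss` stands in for the language model's training loss on an augmented example, and the softmax-over-losses weighting is an illustrative assumption, not the paper's exact parameterization.

```python
import math
import random

def pick_hard_augmentation(candidates, model_loss, temperature=1.0, rng=random):
    """Sample one augmented example, up-weighting candidates on which
    the current model incurs a high loss (softmax over losses)."""
    losses = [model_loss(c) for c in candidates]
    m = max(losses)  # subtract the max for numerical stability
    weights = [math.exp((l - m) / temperature) for l in losses]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy stand-in loss: longer augmented sentences count as "harder".
toy_loss = lambda sent: len(sent.split())
chosen = pick_hard_augmentation(
    ["walk left", "jump around right twice and walk left"],
    toy_loss,
    temperature=0.1,
)
```

With a low temperature the sampler concentrates almost all probability on the highest-loss candidate, mirroring how LCS prioritizes challenging compositions during training.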
Zhaoyi Li
School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
Gangwei Jiang
University of Science and Technology of China
Machine Learning
Chenwang Wu
University of Science and Technology of China
Trustworthy Machine Learning · Data Mining
Ying Wei
Zhejiang University
Machine Learning · Transfer Learning · Continual Learning · AI for Science
Defu Lian
School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China
Enhong Chen
University of Science and Technology of China
Data Mining · Recommender Systems · Machine Learning