🤖 AI Summary
This work addresses the limitations of large language models in compositional generalization, which stem from the long-tailed distribution of complex skill combinations in training data and lead to weak instruction following and poor generalization on agent tasks. To overcome this, the authors propose the STEPS framework, which integrates a hierarchical skill structure with information maximization. Leveraging structural information theory, STEPS constructs a hierarchical skill taxonomy and formulates data synthesis as a constrained information maximization problem, generating training examples that are both semantically consistent and compositionally challenging. This approach enables interpretable and systematic compositional data synthesis, significantly outperforming existing data augmentation methods across multiple instruction-following benchmarks and downstream agent tasks, thereby enhancing the model's compositional generalization capabilities.
📝 Abstract
Large Language Models (LLMs) and agent-based systems often struggle with compositional generalization due to a data bottleneck in which complex skill combinations follow a long-tailed, power-law distribution, limiting both instruction-following performance and generalization in agent-centric tasks. To address this challenge, we propose STEPS, a Skill Taxonomy guided Entropy-based Post-training data Synthesis framework for generating compositionally challenging data. STEPS explicitly targets compositional generalization by uncovering latent relationships among skills and organizing them into an interpretable, hierarchical skill taxonomy using structural information theory. Building on this taxonomy, we formulate data synthesis as a constrained information maximization problem, selecting skill combinations that maximize marginal structural information within the hierarchy while preserving semantic coherence. Experiments on challenging instruction-following benchmarks show that STEPS outperforms existing data synthesis baselines, while also yielding improved compositional generalization in downstream agent-based evaluations.
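The selection step described above can be illustrated with a minimal sketch: treat each skill's rarity in the training data as its self-information, and greedily pick skill combinations with the largest marginal information gain, subject to a semantic-coherence constraint. All skill names, frequencies, and the incompatibility set below are hypothetical, and the plain frequency-based self-information stands in for the paper's structural-information objective, which the abstract does not define in detail.

```python
import math
from itertools import combinations

# Hypothetical skill taxonomy leaves: skill -> frequency in training data.
# Rare skills carry more self-information (-log2 p); the long tail of the
# distribution is exactly where compositional coverage is weakest.
skill_freq = {
    "arithmetic": 0.40,
    "json_formatting": 0.30,
    "multilingual": 0.15,
    "tool_calling": 0.10,
    "legal_reasoning": 0.05,
}

# Toy coherence constraint: pairs assumed semantically incompatible.
incoherent = {frozenset({"arithmetic", "legal_reasoning"})}

def coherent(combo):
    """Reject combinations that contain an incompatible skill pair."""
    return not any(frozenset(p) in incoherent for p in combinations(combo, 2))

def marginal_gain(combo, covered):
    """Information (bits) contributed by skills not yet covered."""
    return sum(-math.log2(skill_freq[s]) for s in combo if s not in covered)

def select(k, size=2):
    """Greedily pick k coherent skill combinations with maximal marginal gain."""
    chosen, covered = [], set()
    candidates = [c for c in combinations(skill_freq, size) if coherent(c)]
    for _ in range(k):
        best = max(candidates, key=lambda c: marginal_gain(c, covered))
        chosen.append(best)
        covered.update(best)
        candidates.remove(best)
    return chosen

print(select(3))
```

Greedy maximization of a marginal gain like this is a standard surrogate for constrained information-maximization problems; the first pick here is the rarest coherent pair, and later picks favor skills not yet covered, so the synthesized set spreads over the tail of the distribution rather than repeating head combinations.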