Compositional Generalization in Autoregressive Models via Logit Composition

📅 2026-05-27
🤖 AI Summary
This work addresses how to compose multiple tasks or skills within autoregressive models while preventing interference among components. Inspired by composition methods for diffusion models, the authors propose a logit-composition strategy that rests on a factorized-conditionals assumption and show that it is projective, a property they describe as previously unexplored in autoregressive systems. Projectivity means that each component governs its own subspace of the output distribution. The property is preserved under smooth reparameterizations of the output space, which yields a feature-space version of the result. The theoretical analysis gives sufficient conditions for successful composition, covering non-interference, stable interactions between components, and preservation of length generalization when the assumptions hold uniformly at the target length.
📝 Abstract
Composing autoregressive models remains a core challenge in understanding how large language models can combine behaviors or skills learned across tasks. We introduce a new and principled composition strategy for autoregressive systems, inspired by composition methods developed for diffusion models. Under a factorized-conditionals assumption, we show that the resulting composition is projective: each component model preserves control over its own designated subspace of the output distribution, avoiding interference between models. This property is further preserved under smooth reparameterizations of the output space, yielding a feature-space theorem. Finally, we show that composition preserves length-generalizing behavior when the factorization assumptions and component guarantees hold uniformly at the target length. These results provide a principled understanding of when model composition and merging succeed in autoregressive systems and identify conditions under which their interactions remain stable.
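The abstract does not spell out the composition operator, so the following is only a toy sketch of what projective logit composition could look like for a single next-token step. It assumes the diffusion-style rule "base logits plus the sum of each component's offset from the base", and it assumes a hypothetical vocabulary whose tokens are pairs (a, b) with a base distribution that factorizes over the two coordinates. The function `compose_logits` and all distributions are illustrative and are not taken from the paper.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over a 1-D logit vector.
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def compose_logits(base, components):
    # Assumed rule (not confirmed by the source):
    # composed = base + sum_i (component_i - base), then renormalize.
    combined = base + sum(c - base for c in components)
    return log_softmax(combined)

# Hypothetical toy vocabulary: each token is a pair (a, b), 3 x 4 = 12 tokens.
pa_base = np.array([0.5, 0.3, 0.2])          # base marginal over a
pb_base = np.array([0.25, 0.25, 0.4, 0.1])   # base marginal over b
qa = np.array([0.1, 0.1, 0.8])               # component A's target for a
qb = np.array([0.7, 0.1, 0.1, 0.1])          # component B's target for b

# Factorized conditionals: each component changes only its own coordinate.
base = np.log(np.outer(pa_base, pb_base)).ravel()
model_a = np.log(np.outer(qa, pb_base)).ravel()
model_b = np.log(np.outer(pa_base, qb)).ravel()

composed = np.exp(compose_logits(base, [model_a, model_b])).reshape(3, 4)
marg_a = composed.sum(axis=1)  # marginal over a under the composition
marg_b = composed.sum(axis=0)  # marginal over b under the composition

print(np.allclose(marg_a, qa), np.allclose(marg_b, qb))
```

In this factorized toy case the composed distribution is the outer product of `qa` and `qb`, so each component keeps control of its own coordinate and the two do not interfere. If the base or the components did not factorize this way, the same rule would not in general reproduce each component's marginal.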
Problem

Research questions and friction points this paper is trying to address.

Compositional Generalization
Autoregressive Models
Model Composition
Factorized Conditionals
Length Generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

logit composition
compositional generalization
autoregressive models
factorized conditionals
length generalization