Scalable Evaluation and Neural Models for Compositional Generalization

πŸ“… 2025-11-04
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Current combinatorial generalization evaluation suffers from inconsistent benchmarks, an overemphasis on computational efficiency at the expense of rigor, and generic vision architectures lacking essential inductive biases for systematic generalization. Method: We propose (1) the first unified evaluation framework with constant-time computational complexity, enabling large-scale, systematic assessment; and (2) Attribute Invariant Networks (AIN), a novel vision backbone explicitly modeling attribute disentanglement and invariance to enhance compositional reasoning. Contribution/Results: Conducting the most comprehensive combinatorial generalization evaluation to date across 5,000+ models, AIN achieves a 23.43% absolute accuracy gain over strong baselines while reducing parameter overhead from 600% to just 16%. This establishes a new Pareto frontier in the trade-off between performance and parameter efficiency. Our framework and architecture provide both theoretical foundations and practical paradigms for interpretable, scalable generalization modeling.

Technology Category

Application Category

πŸ“ Abstract
Compositional generalization-a key open challenge in modern machine learning-requires models to predict unknown combinations of known concepts. However, assessing compositional generalization remains a fundamental challenge due to the lack of standardized evaluation protocols and the limitations of current benchmarks, which often favor efficiency over rigor. At the same time, general-purpose vision architectures lack the necessary inductive biases, and existing approaches to endow them compromise scalability. As a remedy, this paper introduces: 1) a rigorous evaluation framework that unifies and extends previous approaches while reducing computational requirements from combinatorial to constant; 2) an extensive and modern evaluation on the status of compositional generalization in supervised vision backbones, training more than 5000 models; 3) Attribute Invariant Networks, a class of models establishing a new Pareto frontier in compositional generalization, achieving a 23.43% accuracy improvement over baselines while reducing parameter overhead from 600% to 16% compared to fully disentangled counterparts. Our code is available at https://github.com/IBM/scalable-compositional-generalization.
Problem

Research questions and friction points this paper is trying to address.

Assessing compositional generalization lacks standardized evaluation protocols and rigorous benchmarks
General-purpose vision architectures lack inductive biases for compositional generalization
Existing approaches compromise scalability while attempting to improve compositional generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rigorous evaluation framework reducing computational requirements
Attribute Invariant Networks improving accuracy with minimal parameters
Extensive evaluation of over 5000 supervised vision models
πŸ”Ž Similar Papers
No similar papers found.