🤖 AI Summary
This work addresses the *union-closedness problem* in language generation: whether the union of finitely many generatable (or non-uniformly generatable) language classes remains generatable. It resolves three open questions posed by Li et al. Using carefully constructed classes and a novel diagonalization argument, the authors establish that generatable classes are *not closed under finite union*: specifically, they construct a non-uniformly generatable class and a uniformly generatable class whose union is not generatable. As a consequence, they refute the conjecture that the *eventually unbounded closure (EUC)* condition is necessary for uncountable non-uniformly generatable classes, settling a central open problem. These results show that language generation in the limit violates a union-closedness property enjoyed by classical tasks in statistical learning theory, marking a notable divergence from conventional learning frameworks and limiting the compositional use of generators.
📝 Abstract
We investigate language generation in the limit, a model introduced by Kleinberg and Mullainathan [NeurIPS 2024] and extended by Li, Raman, and Tewari [COLT 2025]. While Kleinberg and Mullainathan proved that generation is possible for all countable collections, Li et al. defined a hierarchy of generation notions (uniform, non-uniform, and generatable) and explored their feasibility for uncountable collections.
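To make the setup concrete, here is a toy sketch (not from the paper) of the generation-in-the-limit protocol: an adversary enumerates strings from an unknown target language in a known countable collection, and after each revealed string the generator must emit a new, previously unseen string, succeeding if all but finitely many of its outputs lie in the target. The specific collection `L_i = { a^n : n >= i }` and the consistency-based strategy below are illustrative assumptions, chosen only because this nested collection makes a correct strategy easy to state.

```python
# Toy illustration of generation in the limit (hypothetical example, not the
# paper's construction). Collection: L_i = { "a"*n : n >= i } for i = 1, 2, ...

def in_L(i, s):
    """Membership test for the hypothetical language L_i."""
    return set(s) <= {"a"} and len(s) >= i

def generator(seen, max_index=50):
    # Find the largest index i (up to a cutoff) whose language L_i is
    # consistent with every string seen so far. For this nested collection,
    # the largest consistent L_i is contained in all other consistent
    # languages, so fresh strings drawn from it land in the target.
    consistent = [i for i in range(1, max_index + 1)
                  if all(in_L(i, s) for s in seen)]
    i = max(consistent)
    # Emit a string of L_i strictly longer than anything seen, so it is new.
    n = max((len(s) for s in seen), default=i) + 1
    return "a" * max(n, i)

# Adversary enumerates the target K = L_3 = { "aaa", "aaaa", ... }.
seen, outputs = [], []
for t in range(3, 13):
    seen.append("a" * t)          # adversary reveals "a"*t
    outputs.append(generator(seen))

# Every output is unseen at the time of emission and lies in K = L_3.
assert all(in_L(3, s) for s in outputs)
```

The point of the sketch is only the interaction pattern; the paper's results concern which (possibly uncountable) collections admit such a winning strategy at all, and how the answer behaves under unions.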
Our first set of results resolves two open questions of Li et al. by proving that finite unions of generatable or non-uniformly generatable classes need not be generatable. These follow from a stronger result: there is a non-uniformly generatable class and a uniformly generatable class whose union is not generatable. This adds to the ways in which language generation in the limit differs from traditional tasks in statistical learning theory, such as classification, which are closed under finite unions. In particular, it implies that given two generators for different collections, one cannot combine them into a single "more powerful" generator, ruling out this notion of boosting.
Our construction also addresses a third open question of Li et al.: whether there are uncountable classes that are non-uniformly generatable yet do not satisfy the eventually unbounded closure (EUC) condition introduced by Li, Raman, and Tewari. Our approach uses carefully constructed classes together with a novel diagonalization argument that may be of independent interest in the growing area of language generation.