🤖 AI Summary
This work addresses the *union-closedness problem* in language generation: whether the union of finitely many generatable (or non-uniformly generatable) language classes remains generatable. It resolves three open questions posed by Li et al. Using carefully constructed classes and a novel diagonalization argument, the authors establish that generatable classes are *not closed under finite union*: specifically, they construct a non-uniformly generatable class and a uniformly generatable class whose union is not generatable. As a consequence, they refute the conjecture that the *eventually unbounded closure (EUC)* condition is necessary for uncountable non-uniformly generatable classes, settling a central open problem. These results show that language generation in the limit violates a union-closedness property enjoyed by classical tasks in statistical learning theory, marking a notable divergence from conventional learning frameworks and limiting the compositional use of generators.
📝 Abstract
We investigate language generation in the limit, a model introduced by Kleinberg and Mullainathan [NeurIPS 2024] and extended by Li, Raman, and Tewari [COLT 2025]. While Kleinberg and Mullainathan proved that generation is possible for all countable collections, Li et al. defined a hierarchy of generation notions (uniform, non-uniform, and generatable) and explored their feasibility for uncountable collections.
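To make the setup concrete, here is a toy sketch (not from the paper) of the generation-in-the-limit protocol: an adversary enumerates strings from an unknown target language in a known countable collection, and after each revealed string the generator must emit a new, previously unseen string, succeeding if all but finitely many of its outputs lie in the target. The specific collection `L_i = { a^n : n >= i }` and the consistency-based strategy below are illustrative assumptions, chosen only because this nested collection makes a correct strategy easy to state.

```python
# Toy illustration of generation in the limit (hypothetical example, not the
# paper's construction). Collection: L_i = { "a"*n : n >= i } for i = 1, 2, ...

def in_L(i, s):
    """Membership test for the hypothetical language L_i."""
    return set(s) <= {"a"} and len(s) >= i

def generator(seen, max_index=50):
    # Find the largest index i (up to a cutoff) whose language L_i is
    # consistent with every string seen so far. For this nested collection,
    # the largest consistent L_i is contained in all other consistent
    # languages, so fresh strings drawn from it land in the target.
    consistent = [i for i in range(1, max_index + 1)
                  if all(in_L(i, s) for s in seen)]
    i = max(consistent)
    # Emit a string of L_i strictly longer than anything seen, so it is new.
    n = max((len(s) for s in seen), default=i) + 1
    return "a" * max(n, i)

# Adversary enumerates the target K = L_3 = { "aaa", "aaaa", ... }.
seen, outputs = [], []
for t in range(3, 13):
    seen.append("a" * t)          # adversary reveals "a"*t
    outputs.append(generator(seen))

# Every output is unseen at the time of emission and lies in K = L_3.
assert all(in_L(3, s) for s in outputs)
```

The point of the sketch is only the interaction pattern; the paper's results concern which (possibly uncountable) collections admit such a winning strategy at all, and how the answer behaves under unions.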
Our first set of results resolves two open questions of Li et al. by proving that finite unions of generatable or non-uniformly generatable classes need not be generatable. These follow from a stronger result: there is a non-uniformly generatable class and a uniformly generatable class whose union is not generatable. This adds to the ways in which language generation in the limit differs from traditional tasks in statistical learning theory, such as classification, which are closed under finite unions. In particular, it implies that given two generators for different collections, one cannot combine them into a single "more powerful" generator, ruling out this notion of boosting.
Our construction also addresses a third open question of Li et al.: whether there are uncountable classes that are non-uniformly generatable yet do not satisfy the eventually unbounded closure (EUC) condition introduced by Li, Raman, and Tewari. Our approach uses carefully constructed classes together with a novel diagonalization argument that may be of independent interest in the growing area of language generation.