🤖 AI Summary
Earlier work argues that recursive numeral systems, such as natural-language cardinal systems, optimally balance lexicon size and morphosyntactic complexity, yet showing that only natural-language-like systems do so has required ad hoc constraints to exclude unnatural systems. This study integrates *regularity*, a formal property from automata theory, into linguistic optimality analysis via the Minimum Description Length (MDL) principle, drawing on information theory and formal language theory to quantify both the regularity and the processing complexity of numeral systems. Results show that recursion intrinsically enhances regularity, substantially reducing cognitive load; the previously imposed ad hoc constraints emerge naturally as consequences of regularity differences. The work identifies regularity-driven optimization as a core mechanism underlying cross-linguistic numeral system evolution, yielding a computationally grounded, universal criterion for formal semantics, language acquisition, and artificial language design.
📝 Abstract
Previous work has argued that recursive numeral systems optimise the trade-off between lexicon size and average morphosyntactic complexity (Denić and Szymanik, 2024). However, showing that only natural-language-like systems optimise this trade-off has proven elusive, and the existing solution has relied on ad hoc constraints to rule out unnatural systems (Yang and Regier, 2025). Here, we argue that this issue arises because the proposed trade-off has neglected regularity, a crucial aspect of complexity central to human grammars in general. Drawing on the Minimum Description Length (MDL) approach, we propose that recursive numeral systems are better viewed as efficient with regard to their regularity and processing complexity. We show that our MDL-based measures of regularity and processing complexity better capture the key differences between attested, natural systems and unattested but possible ones, including "optimal" recursive numeral systems from previous work, and that the ad hoc constraints from previous literature naturally follow from regularity. Our approach highlights the need to incorporate regularity across sets of forms in studies that attempt to measure and explain optimality in language.