IASC: Interactive Agentic System for ConLangs

📅 2025-10-08

🤖 AI Summary

This work addresses the difficulty of incorporating linguistic knowledge into constructed language (conlang) design. It presents a modular, LLM-based agentic system that decomposes conlang creation into stages: phonology design, morphosyntactic markup of English source sentences, lexicon construction, orthography generation, and writing a brief grammatical handbook. The phonology stage refines its output iteratively, using commentary feedback on its previous attempt. The study probes how much LLMs understand general linguistic concepts and finds a fairly wide gulf in capabilities, both among different LLMs and among different linguistic specifications, with common patterns handled notably better than rare ones. It also explores applying the approach to translation from high-resource into low-resource languages; the results so far are mostly negative, though there is some evidence that an improved version of the system could yield real gains.

📝 Abstract

We present a system that uses LLMs as a tool in the development of Constructed Languages. The system is modular in that one first creates a target phonology for the language using an agentic approach that refines its output at each step with commentary feedback on its previous attempt. Next, a set of sentences is 'translated' from their English original into a morphosyntactic markup that reflects the word order and morphosyntactic feature specifications of the desired target language, with affixes represented as morphosyntactic feature bundles. From this translated corpus, a lexicon is constructed using the phonological model and the set of morphemes (stems and affixes) extracted from the 'translated' sentences. The system is then instructed to provide an orthography for the language, using an existing script such as Latin or Cyrillic. Finally, the system writes a brief grammatical handbook of the language. The system can also translate further sentences into the target language. Our goal is twofold. First, we hope that these tools will be fun to use for creating artificially constructed languages. Second, we are interested in exploring what LLMs 'know' about language: not what they know about any particular language or linguistic phenomenon, but how much they know about and understand language and linguistic concepts. As we shall see, there is a fairly wide gulf in capabilities both among different LLMs and among different linguistic specifications, with it being notably easier for systems to deal with more common patterns than rarer ones. An additional avenue that we explore is the application of our approach to translating from high-resource into low-resource languages. While the results so far are mostly negative, we provide some evidence that an improved version of the present system could afford some real gains in such tasks. Repository: https://github.com/SakanaAI/IASC
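To make the markup-to-lexicon step described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the hyphen-separated markup format (`stem-FEATURE`), the toy phoneme inventory, the CVCV/CV shape rules, and the function names `extract_morphemes`, `build_lexicon`, and `realize` are all assumptions made for illustration. In the actual system an LLM produces the phonology and the markup.

```python
import random
from itertools import product

# Hypothetical toy phonology (an assumption, not taken from the paper).
CONSONANTS = ["p", "t", "k", "m", "n", "s"]
VOWELS = ["a", "i", "u"]


def extract_morphemes(marked_up_sentences):
    """Collect stems and affix feature bundles from hyphen-separated markup."""
    stems, affixes = [], []
    for sentence in marked_up_sentences:
        for word in sentence.split():
            stem, *bundles = word.split("-")
            if stem not in stems:
                stems.append(stem)
            for bundle in bundles:
                if bundle not in affixes:
                    affixes.append(bundle)
    return stems, affixes


def build_lexicon(stems, affixes, seed=0):
    """Give each morpheme a distinct form: CVCV for stems, CV for affixes."""
    rng = random.Random(seed)
    cv = ["".join(pair) for pair in product(CONSONANTS, VOWELS)]
    cvcv = [first + second for first in cv for second in cv]
    lexicon = dict(zip(stems, rng.sample(cvcv, len(stems))))
    lexicon.update(zip(affixes, rng.sample(cv, len(affixes))))
    return lexicon


def realize(sentence, lexicon):
    """Spell out a marked-up sentence by concatenating morpheme forms."""
    return " ".join(
        "".join(lexicon[morpheme] for morpheme in word.split("-"))
        for word in sentence.split()
    )


corpus = ["dog-PL see-PST cat-ACC", "cat-PL see-PST dog-ACC"]
stems, affixes = extract_morphemes(corpus)
lexicon = build_lexicon(stems, affixes)
print(realize(corpus[0], lexicon))
```

The sketch fixes the random seed so that the same morpheme always receives the same form across runs. A real lexicon builder would also need to handle allomorphy and phonotactic constraints, which this example leaves out.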

Problem

Research questions and friction points this paper is trying to address.

Developing constructed languages using LLMs through modular agentic systems

Exploring LLM knowledge of linguistic concepts and cross-language capabilities

Creating complete language systems including phonology, lexicon and grammar

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular system using LLMs for constructed language development

Agentic approach refining phonology through iterative feedback

Lexicon construction from morphosyntactic markup, followed by orthography and a brief grammatical handbook
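The iterative-feedback idea listed above can be sketched as a propose-and-critique loop. This is a hedged illustration only: `refine`, `toy_propose`, and `toy_critique` are hypothetical names, and the rule-based stubs stand in for what would be LLM calls in the real system. The `REQUIRED` and `FORBIDDEN` sets are an invented target specification.

```python
# Hypothetical target specification for the phoneme inventory (an assumption).
REQUIRED = {"p", "t", "k", "a", "i", "u"}
FORBIDDEN = {"θ"}


def toy_propose(previous, feedback):
    """Stand-in for an LLM call: apply each feedback note to the prior draft."""
    if previous is None:
        return {"p", "t", "θ", "a"}  # deliberately flawed first attempt
    inventory = set(previous)
    for action, phoneme in feedback:
        if action == "add":
            inventory.add(phoneme)
        else:
            inventory.discard(phoneme)
    return inventory


def toy_critique(inventory):
    """Stand-in for a critic: list what is missing and what must go."""
    notes = [("add", p) for p in sorted(REQUIRED - inventory)]
    notes += [("remove", p) for p in sorted(FORBIDDEN & inventory)]
    return notes


def refine(propose, critique, max_steps=5):
    """Revise a draft until the critic has no comments or steps run out."""
    draft = propose(None, [])
    history = []
    for _ in range(max_steps):
        feedback = critique(draft)
        history.append((draft, feedback))
        if not feedback:
            break
        draft = propose(draft, feedback)
    return draft, history


final_inventory, history = refine(toy_propose, toy_critique)
print(sorted(final_inventory), len(history))
```

Capping the loop with `max_steps` keeps the sketch from running forever when a critic never becomes satisfied, which is a plausible concern with model-generated feedback.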
