🤖 AI Summary
This work investigates the theoretical foundations of safe language generation within the framework of learning in the limit. It formally defines, for the first time, the tasks of safe language identification and generation to address the challenge of avoiding harmful or policy-violating content, leveraging formal language theory and computability analysis to assess their feasibility. The study establishes that safe language identification is impossible in this model, and that safe language generation is at least as hard as (vanilla) language identification, which is likewise impossible. Furthermore, it delineates the boundary between tractable and intractable cases, characterizing conditions under which safe generation is feasible or infeasible. These results establish fundamental theoretical limits and provide a rigorous formal basis for future research on safe language generation.
📝 Abstract
Recent results in learning a language in the limit have shown that, although language identification is impossible, language generation is tractable. As this foundational area expands, we need to consider the implications of language generation in real-world settings. This work offers the first theoretical treatment of safe language generation. Building on the computational paradigm of learning in the limit, we formalize the tasks of safe language identification and generation. We prove that under this model, safe language identification is impossible, and that safe language generation is at least as hard as (vanilla) language identification, which is also impossible. Lastly, we discuss several intractable and tractable cases.
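For readers unfamiliar with the learning-in-the-limit paradigm the abstract builds on, the following is a minimal illustrative sketch of Gold-style identification in the limit, not code from the paper: the learner observes an enumeration (a "text") of the target language and, after each example, guesses the first hypothesis consistent with everything seen so far. The toy hypothesis class and all names below are assumptions chosen purely for illustration.

```python
# Illustrative sketch (not from the paper): Gold-style identification
# in the limit over a toy, inclusion-ordered class of finite languages.
# Each language is represented as a finite set of strings.
HYPOTHESES = [
    {"a"},
    {"a", "b"},
    {"a", "b", "c"},
]

def learner(observed):
    """Return the index of the first hypothesis containing every observed string."""
    for i, lang in enumerate(HYPOTHESES):
        if observed <= lang:
            return i
    return None  # no consistent hypothesis in the class

def identify_in_the_limit(text):
    """Feed a text (a sequence enumerating the target language) to the
    learner and return the sequence of its guesses after each example."""
    observed = set()
    guesses = []
    for s in text:
        observed.add(s)
        guesses.append(learner(observed))
    return guesses

# A text for the target language {"a", "b"}: once "b" appears, the
# learner's guess converges to index 1 and never changes again.
guesses = identify_in_the_limit(["a", "a", "b", "a", "b"])
```

Because this toy class forms an inclusion chain, the "first consistent hypothesis" strategy converges; the paper's impossibility results concern general (infinite, non-chain) classes of languages, where no such strategy can succeed.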