Controllable Spoken Dialogue Generation: An LLM-Driven Grading System for K-12 Non-Native English Learners

📅 2026-04-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the mismatch between the output of existing large language models and the linguistic proficiency of K–12 English as a Foreign Language (EFL) learners, a mismatch that undermines instructional efficacy. To bridge this gap, the authors construct a four-level grading framework grounded in China’s Standards of English Language Ability (CSE) and use controllable generation techniques to modulate the lexical complexity of dialogues. They further develop graded vocabulary lists and a multi-turn dialogue corpus tailored to these proficiency tiers. The work introduces Diversity-Driven Policy Optimization (DDPO), an algorithm that jointly enhances response diversity and overall quality in multi-turn interactions. The resulting system is presented as a proficiency-adaptive dialogue generation platform designed specifically for K–12 EFL learners, with the potential to transfer to other educational standards. Experimental results are reported to show lower out-of-vocabulary rates alongside improvements in dialogue diversity, naturalness, and pedagogical appropriateness. The authors state that all models, data, and code will be publicly released.
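The out-of-vocabulary (OOV) rate mentioned above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny `GRADED_VOCAB` word lists, the rule that each tier also permits the words of all lower tiers, and the simple regex tokenizer. None of it is taken from the paper's resources or its exact metric definition.

```python
import re

# Hypothetical graded vocabulary (NOT the paper's lists):
# tier number -> words first introduced at that tier.
GRADED_VOCAB = {
    1: {"i", "you", "like", "a", "the", "cat", "dog", "is", "my"},
    2: {"favorite", "because", "play", "school", "friend"},
    3: {"although", "environment", "prefer"},
    4: {"nevertheless", "sustainable"},
}


def allowed_words(tier):
    """Return every word permitted at `tier`, assuming tiers are cumulative."""
    allowed = set()
    for level, words in GRADED_VOCAB.items():
        if level <= tier:
            allowed |= words
    return allowed


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def oov_rate(utterance, tier):
    """Fraction of tokens in `utterance` that fall outside the tier's vocabulary."""
    tokens = tokenize(utterance)
    if not tokens:
        return 0.0
    allowed = allowed_words(tier)
    oov_count = sum(1 for token in tokens if token not in allowed)
    return oov_count / len(tokens)


if __name__ == "__main__":
    print(oov_rate("I prefer my dog", tier=1))  # "prefer" is a tier-3 word here
    print(oov_rate("I prefer my dog", tier=3))
```

A real system would likely need lemmatization and handling of proper nouns before a token is counted as out of vocabulary; this sketch omits both.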

📝 Abstract

Large language models (LLMs) often fail to meet the pedagogical needs of K-12 English learners in non-native contexts due to a proficiency mismatch. To address this widespread challenge, we introduce a proficiency-aligned framework that adapts LLM outputs to learner abilities, using China's national curriculum (CSE) as a representative case. Our framework enables precise control over lexical complexity through a four-tier grading system, supported by two new resources: graded vocabulary lists and a multi-turn dialogue corpus. Our core technical contribution is Diversity-Driven Policy Optimization (DDPO), a multi-turn, GRPO-based algorithm designed to preserve dialogue diversity while holistically optimizing dialogue quality. This method significantly outperforms conventional approaches, achieving low out-of-vocabulary rates and high diversity while enhancing conversational naturalness and pedagogical value. While grounded in the CSE, our framework is designed for flexibility and can be readily adapted to other educational standards. Our models, data, and code will all be open-sourced, providing a scalable platform for personalized English speaking practice that addresses the unique challenges faced by K-12 learners in non-immersive environments.
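The abstract describes DDPO as GRPO-based but does not spell out its objective. As background only, the sketch below shows the group-relative advantage normalization that GRPO (Group Relative Policy Optimization) is generally built on: several responses are sampled for the same prompt, and each response's reward is standardized against the group's mean and standard deviation. The function name, the use of the population standard deviation, and the `eps` constant are assumptions for illustration; DDPO's diversity-preserving term is not shown because this summary does not specify it.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against its sampling group.

    `rewards` holds one scalar reward per response sampled for the same prompt.
    `eps` (an assumed constant) avoids division by zero when all rewards match.
    """
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population std; implementations may differ
    return [(reward - group_mean) / (group_std + eps) for reward in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for three sampled dialogue responses.
    for advantage in group_relative_advantages([1.0, 2.0, 3.0]):
        print(round(advantage, 4))
```

Responses scoring above the group mean receive positive advantages and are reinforced; those below receive negative advantages. When every response in a group earns the same reward, all advantages are zero and the group contributes no learning signal.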

Problem

Research questions and friction points this paper is trying to address.

Controllable Dialogue Generation

K-12 English Learners

Proficiency Alignment

Non-Native Contexts

Spoken Dialogue

Innovation

Methods, ideas, or system contributions that make the work stand out.

DDPO

proficiency-aligned dialogue generation

lexical complexity control