SRS-Stories: Vocabulary-constrained multilingual story generation for language learning

📅 2025-12-20
🤖 AI Summary
This study addresses the challenge of generating personalized reading materials for language learning: producing natural, grammatically correct, and semantically coherent texts that strictly incorporate target new words and spaced-repetition-scheduled review words, while respecting learners’ known vocabulary constraints. We propose the first multilingual story generation framework that deeply integrates lexically constrained text generation with a spaced repetition system (SRS). Our approach comprises fine-tuned large language models (LLMs), three novel lexical-constraint decoding strategies, a cross-lingual (English/Chinese/Polish) generation architecture, and a cognitively grounded SRS-driven dynamic scheduling algorithm. Experiments demonstrate that our method significantly outperforms baseline constrained beam search in grammaticality, coherence, and lexical usage quality, achieving state-of-the-art performance in both human evaluation and automated metrics.

📝 Abstract
In this paper, we use large language models to generate personalized stories for language learners, using only the vocabulary they know. The generated texts are specifically written to teach the user new vocabulary by simply reading stories where it appears in context, while at the same time seamlessly reviewing recently learned vocabulary. The generated stories are enjoyable to read and the vocabulary reviewing/learning is optimized by a Spaced Repetition System. The experiments are conducted in three languages: English, Chinese and Polish, evaluating three story generation methods and three strategies for enforcing lexical constraints. The results show that the generated stories are more grammatical, coherent, and provide better examples of word usage than texts generated by the standard constrained beam search approach.
Problem

Research questions and friction points this paper is trying to address.

Generates personalized stories for language learners using known vocabulary
Teaches new vocabulary through context while reviewing recently learned words
Optimizes vocabulary learning with a Spaced Repetition System in multiple languages
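To make the vocabulary constraint concrete, the sketch below checks a candidate story against a learner's word lists. It is an illustration only, not the paper's method: the function name `check_story`, the regex tokenizer, and the exact-match comparison (no lemmatization, no word segmentation for Chinese) are all assumptions introduced here.

```python
import re

def check_story(story, known, new_words, review_words):
    """Return (out_of_vocabulary, missing_targets) for a candidate story.

    Hypothetical helper: compares lowercase surface forms only, so inflected
    forms and unsegmented Chinese text would need extra preprocessing.
    """
    # Extract runs of Unicode letters (digits and underscores excluded).
    tokens = {t.lower() for t in re.findall(r"[^\W\d_]+", story)}
    allowed = known | new_words | review_words
    out_of_vocab = sorted(tokens - allowed)                  # words the learner has not seen
    missing = sorted((new_words | review_words) - tokens)    # required words left out
    return out_of_vocab, missing

known = {"the", "cat", "sat", "on", "a", "mat"}
oov, missing = check_story("The cat sat on a red mat.", known, {"red"}, {"dog"})
```

A story would satisfy the constraints in this simplified sense when both returned lists are empty; here the review word "dog" is still missing.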
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses large language models for personalized story generation
Integrates Spaced Repetition System to optimize vocabulary learning
Enforces lexical constraints across multiple languages effectively
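The role of the Spaced Repetition System can be sketched as choosing which words are due for review before a story is generated. The scheduling rule used by the authors is not stated on this page, so the example assumes a simple Leitner-style rule (double the interval after a successful review, reset it otherwise); `Card`, `due_words`, and `mark_reviewed` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Card:
    word: str
    interval_days: int = 1
    due: date = date(2025, 1, 1)

def due_words(cards, today):
    """Words whose review date has arrived; candidates for the next story."""
    return [c.word for c in cards if c.due <= today]

def mark_reviewed(card, today, remembered=True):
    """Assumed rule: double the interval on success, reset to one day on failure."""
    card.interval_days = card.interval_days * 2 if remembered else 1
    card.due = today + timedelta(days=card.interval_days)

today = date(2025, 1, 10)
cards = [Card("kot", 1, date(2025, 1, 9)), Card("pies", 4, date(2025, 1, 12))]
review = due_words(cards, today)   # only "kot" is due on this date
mark_reviewed(cards[0], today)     # "kot" is rescheduled two days later
```

Under these assumptions, the due words would be passed to the story generator as review words alongside any new target words.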
Wiktor Kamzela
Poznan University of Technology, Institute of Computer Science, Poznan, Poland
Mateusz Lango
Charles University / Poznan University of Technology
natural language processing, machine learning, explainable AI
Ondrej Dusek
Charles University, Faculty of Mathematics and Physics, Prague, Czechia