Retell, Reward, Repeat: Reinforcement Learning for Narrative Theory-Informed Story Generation

📅 2026-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses automatic story generation (ASG) approaches that lack narrative-theory grounding in training and evaluation and that rely on limited ground truths. It proposes a linguistically principled, subjectivity-aware post-training approach that derives principles from Todorov’s Theory of Narrative Equilibrium and uses them in reinforcement learning. Specifically, 7B and 14B LLM-as-judge models are prompted with these principles, tested for alignment with human annotators, and used to provide reward signals during d-RLAIF. Gemini-3-Flash then evaluates the post-trained models' outputs and compares them with human-written stories from the TimeTravel dataset. The results indicate that d-RLAIF is a viable alternative to supervised fine-tuning (SFT), producing stories that are more diverse and better aligned with human narrative conventions, which supports the use of reinforcement learning for subjective generation tasks grounded in literary theory.
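The summary does not say how story diversity is measured. As a rough illustration only, the sketch below computes distinct-n, a common lexical diversity proxy (unique n-grams divided by total n-grams). The function name `distinct_n`, the whitespace tokenisation, and the choice of this metric are all assumptions, not the paper's stated method.

```python
from typing import Iterable


def distinct_n(texts: Iterable[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams across all texts.

    Assumption: this is a generic diversity proxy, not the metric
    used in the paper. Tokens are lowercased whitespace splits.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    total = 0
    unique = set()
    for text in texts:
        tokens = text.lower().split()
        # Slide a window of size n over the token sequence.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    # Return 0.0 when no n-grams exist (empty input or texts shorter than n).
    return len(unique) / total if total else 0.0
```

Higher values mean less repetition across the generated stories; comparing the score for SFT outputs against d-RLAIF outputs on the same prompts would give one coarse view of the diversity claim.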

📝 Abstract
Despite the subjective nature of storytelling, prior work on automatic story generation (ASG) has relied on limited ground truths for training and evaluation. In this work, we explore reinforcement learning (d-RLAIF) as a post-training alternative to supervised fine-tuning (SFT). We first apply Todorov's Theory of Narrative Equilibrium to establish principles that define desirable ASG qualities. We prompt 7B and 14B LLM-as-judge models with our principles to test alignment with human annotators and provide reward signals during d-RLAIF. We use Gemini-3-Flash to evaluate the output of our post-trained models and compare them to human-written stories from the TimeTravel dataset. We show that d-RLAIF offers a viable alternative to SFT, producing stories that are more diverse and aligned with human narrative conventions. Our paper demonstrates the promise of reinforcement learning for linguistically grounded post-training for subjective tasks such as ASG.
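To make the reward step concrete, here is a minimal sketch of how per-principle verdicts from a judge could be collapsed into one scalar reward for a sampled story. It assumes the judge returns a yes/no verdict per principle and that the reward is the fraction of principles satisfied. The principle wording (loosely following Todorov's five stages), the `Judge` type, and `principle_reward` are hypothetical, since the abstract does not give the paper's actual principles or aggregation rule.

```python
from typing import Callable, Sequence

# Hypothetical principles loosely following Todorov's five stages.
# The paper's real principle wording is not given in this abstract.
PRINCIPLES = (
    "The story opens in a stable state of equilibrium.",
    "An event disrupts that equilibrium.",
    "The characters recognise the disruption.",
    "The characters attempt to repair the disruption.",
    "The story ends in a restored or new equilibrium.",
)

# A judge takes (story, principle) and says whether the principle holds.
# In practice this would wrap a prompt to a 7B or 14B judge model.
Judge = Callable[[str, str], bool]


def principle_reward(
    story: str,
    judge: Judge,
    principles: Sequence[str] = PRINCIPLES,
) -> float:
    """Return the fraction of principles the judge marks as satisfied.

    Assumption: equal weighting of principles; the paper may differ.
    """
    if not principles:
        raise ValueError("at least one principle is required")
    verdicts = [judge(story, p) for p in principles]
    return sum(verdicts) / len(principles)
```

Calling `principle_reward(story, judge)` once per sampled story would yield a value in [0, 1] that a policy-gradient update could consume directly, without training a separate reward model.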
Problem

Research questions and friction points this paper is trying to address.

automatic story generation
narrative theory
subjective evaluation
reinforcement learning
story diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
narrative theory
LLM-as-judge
post-training
story generation
David Y. Liu
University of New South Wales, Sydney, Australia
Xanthe Muston
University of New South Wales, Sydney, Australia
Aditya Joshi
Senior Lecturer/Assistant Professor, UNSW
Natural Language Processing, AI for Social Good
Sebastian Sequoiah-Grayson
University of New South Wales, Sydney, Australia