Narrative Flattening: How Post-Training Compresses Thematic, Affective, and Stylistic Variation in LLM Fiction

📅 2026-05-26
🤖 AI Summary
This study examines narrative flattening in fiction generated by large language models: reduced narrative depth and increased structural homogeneity whose origins and cross-domain effects remain poorly understood. The work introduces the term “narrative flattening” and presents a systematic quantification of how post-training compresses narrative variation. Using four checkpoints of a single OLMo 32B model series (Base, SFT, DPO, RLVR) alongside matched human text, it runs controlled story-continuation experiments across three domains (StoryStar, TMAS, and The New Yorker) and evaluates them with sentence-level measures of thematic motion, affective prevalence, and linguistic diversity. The results indicate that post-training makes thematic transitions more uniform, shifts high-intensity emotion toward neutrality, and shrinks stylistic variation across stories. Professional literary writing is compressed most, and post-trained outputs converge across domains.
📝 Abstract
Large language models produce fluent fiction, yet their creative output is widely seen as flat. We ask where this quality originates in training and whether it affects different domains of human fiction equally. We construct a matched story-continuation paradigm across three domains, StoryStar (public-platform), TMAS (prompt-guided), and The New Yorker (professional literary), and compare continuations from four OLMo 32B checkpoints (Base, SFT, DPO, RLVR) against matched human text. Because these checkpoints share architecture, scale, tokenizer, and pretraining, the design isolates the post-training effect. We measure each continuation along three sentence-level dimensions: thematic motion, affective prevalence, and linguistic diversity. Across all three, post-training compresses dynamic variation: thematic transitions become more uniform, high-intensity emotions give way to neutrality, and stylistic diversity across stories shrinks. We term this progressive loss narrative flattening. The effect is directionally stable across story domains, but the size of the gap depends on the human baseline: professional literary fiction is compressed most, while public-platform and prompt-guided stories show smaller gaps, consistent with their human baselines sitting closer to the model's default rhythm. Post-trained endpoints converge across domains, suggesting alignment produces a continuation regime largely insensitive to the source domain's narrative texture.
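The abstract does not specify how the three dimensions are computed, so the following is only an illustrative sketch of two generic stand-ins, not the paper's method. It assumes each sentence has already been given a topic label by some upstream classifier (the labels here are hypothetical), scores thematic motion as the Shannon entropy of adjacent label pairs, and scores linguistic diversity as the share of unique word n-grams across a set of continuations. The function names `transition_entropy` and `distinct_n` are introduced here for illustration.

```python
import math
from collections import Counter


def transition_entropy(labels):
    """Shannon entropy (in bits) of adjacent label pairs in one sequence.

    Assumption: `labels` holds one topic label per sentence, produced by
    some upstream classifier that is not shown here.
    """
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return 0.0
    counts = Counter(pairs)
    total = len(pairs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distinct_n(texts, n=2):
    """Fraction of word n-grams that are unique across a set of texts.

    Assumption: whitespace tokenization is good enough for a sketch.
    """
    ngrams = []
    for text in texts:
        tokens = text.lower().split()
        # Slide a window of width n over the tokens of this text.
        ngrams.extend(zip(*(tokens[i:] for i in range(n))))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


if __name__ == "__main__":
    # Hypothetical sentence-level topic labels for two continuations.
    uniform = ["home", "home", "home", "home"]
    varied = ["home", "road", "memory", "storm", "home"]
    print(transition_entropy(uniform), transition_entropy(varied))

    # Hypothetical continuations: identical texts score lower on diversity.
    print(distinct_n(["the cat sat", "the cat sat"]))
    print(distinct_n(["the cat sat", "a dog ran"]))
```

Under these assumptions, a flattened continuation would show lower transition entropy and a set of flattened continuations would show a lower distinct n-gram fraction than their human counterparts. Comparing the same scores across the Base, SFT, DPO, and RLVR checkpoints is one way the compression described above could be made visible.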
Problem

Research questions and friction points this paper is trying to address.

narrative flattening
large language models
fiction generation
post-training
stylistic variation
Innovation

Methods, ideas, or system contributions that make the work stand out.

narrative flattening
post-training compression
thematic variation
affective prevalence
stylistic diversity