StoryScope: Investigating idiosyncrasies in AI fiction

πŸ“… 2026-04-03
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This study addresses the challenge of distinguishing AI-generated from human-authored fictional narratives without relying on superficial stylistic cues, and further identifies distinctive narrative signatures of different large language models. To this end, we introduce StoryScope, a pipeline that extracts fine-grained narrative features across ten discourse-level dimensions from over 60,000 stories of approximately 5,000 words each, focusing on deep structural aspects such as character agency and temporal discontinuity. Our analysis reveals fundamental differences in narrative construction: AI-generated stories exhibit thematic over-explanation and linear plot structures, whereas human narratives demonstrate greater moral ambiguity and temporal complexity. Using only these narrative features, our approach achieves a macro F1 score of 93.2% for human–AI detection and 68.4% for six-way authorship attribution; notably, a core set of 30 features retains over 97% of this performance, underscoring narrative structure as a robust signal for identifying the origin of creative text.
πŸ“ Abstract
As AI-generated fiction becomes increasingly prevalent, questions of authorship and originality are becoming central to how written work is evaluated. While most existing work in this space focuses on identifying surface-level signatures of AI writing, we ask instead whether AI-generated stories can be distinguished from human ones without relying on stylistic signals, focusing on discourse-level narrative choices such as character agency and chronological discontinuity. We propose StoryScope, a pipeline that automatically induces a fine-grained, interpretable space of discourse-level narrative features across 10 dimensions. We apply StoryScope to a parallel corpus of 10,272 writing prompts, each answered by a human author and five LLMs, yielding 61,608 stories of ~5,000 words each and 304 extracted features per story. Narrative features alone achieve 93.2% macro-F1 for human vs. AI detection and 68.4% macro-F1 for six-way authorship attribution, retaining over 97% of the performance of models that include stylistic cues. A compact set of 30 core narrative features captures much of this signal: AI stories over-explain themes and favor tidy, single-track plots, while human stories frame protagonists' choices as more morally ambiguous and exhibit greater temporal complexity. Per-model fingerprint features enable six-way attribution: for example, Claude produces notably flat event escalation, GPT over-indexes on dream sequences, and Gemini defaults to external character description. We find that AI-generated stories cluster in a shared region of narrative space, while human-authored stories exhibit greater diversity. More broadly, these results suggest that differences in underlying narrative construction, not just writing style, can be used to separate human-written original works from AI-generated fiction.
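The detection and attribution results above are reported as macro-F1, the unweighted mean of per-class F1 scores, which weights the human class and each model class equally regardless of class size. A minimal stdlib-only sketch of the metric (the label names and toy predictions are illustrative, not the paper's data or classifier):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1s.append(f1)
    return sum(f1s) / len(f1s)

# Toy six-way attribution run: "human" plus hypothetical model labels.
y_true = ["human", "gpt", "claude", "gemini", "human", "gpt"]
y_pred = ["human", "gpt", "gemini", "gemini", "human", "claude"]
print(round(macro_f1(y_true, y_pred), 3))  # β†’ 0.583
```

Because every class contributes equally to the average, a classifier that only gets the majority class right is penalized heavily, which is why macro-F1 is the natural choice for the paper's six-way attribution setting.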
Problem

Research questions and friction points this paper is trying to address.

AI-generated fiction
narrative features
authorship attribution
discourse-level analysis
human vs. AI detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

narrative structure
discourse-level features
AI-generated fiction
authorship attribution
StoryScope