🤖 AI Summary
This study addresses the challenge of identifying narratively salient events—those critical to story progression—by proposing a contrastive learning framework based on “narrative twins,” defined as pairs of stories sharing the same plot but differing in surface form. The model learns story embeddings that capture narrative salience by distinguishing original stories from their narrative twins and from semantically similar yet plot-divergent distractors. Key innovations include the novel use of narrative twins to construct contrastive tasks and the evaluation of four narratologically motivated operations for inferring salience: deletion, shifting, disruption, and summarization. When twin data are unavailable, twins can be generated from a single story via random dropout, while effective distractors can be obtained by prompting large language models or, for long-form narratives, by drawing on other parts of the same story. Experiments on ROCStories and long-form Wikipedia plot summaries demonstrate that the learned embeddings substantially outperform masked language model baselines, with summarization emerging as the most reliable operation for identifying salient sentences.
📝 Abstract
Understanding narratives requires identifying which events are most salient for a story's progression. We present a contrastive learning framework for modeling narrative salience that learns story embeddings from narrative twins: stories that share the same plot but differ in surface form. Our model is trained to distinguish a story from both its narrative twin and a distractor with similar surface features but different plot. Using the resulting embeddings, we evaluate four narratologically motivated operations for inferring salience (deletion, shifting, disruption, and summarization). Experiments on short narratives from the ROCStories corpus and longer Wikipedia plot summaries show that contrastively learned story embeddings outperform a masked-language-model baseline, and that summarization is the most reliable operation for identifying salient sentences. If narrative twins are not available, random dropout can be used to generate the twins from a single story. Effective distractors can be obtained either by prompting LLMs or, in long-form narratives, by using different parts of the same story.
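The training signal described above can be sketched as a standard contrastive (InfoNCE-style) objective over a triple of story embeddings: the anchor story, its narrative twin as the positive, and the surface-similar distractor as the negative. The snippet below is a minimal illustration of that idea, not the paper's actual implementation; the embedding vectors, the `temperature` value, and the pairwise (single-negative) setup are all assumptions for clarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, twin, distractor, temperature=0.1):
    """InfoNCE-style loss: the narrative twin is the positive,
    the plot-divergent distractor is the negative.
    Lower loss means the anchor sits closer to its twin."""
    logits = [cosine(anchor, twin) / temperature,
              cosine(anchor, distractor) / temperature]
    # Numerically stable -log softmax of the positive logit.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]

# Toy embeddings: the loss is small when the twin is near the anchor
# and the distractor is far, and large in the reversed configuration.
good = contrastive_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])
bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [0.9, 0.1])
```

Once such embeddings are trained, a salience operation like deletion can be scored analogously: embed the story with and without a given sentence and treat the embedding shift as that sentence's salience.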