Modelling Intertextuality with N-gram Embeddings

📅 2025-09-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of quantifying intertextuality in literary texts at scale. It proposes a modeling method based on *n*-gram embeddings: each text is segmented into *n*-grams, the *n*-grams are embedded in a shared semantic space, and the intertextual strength of a pair of texts is computed as the average cosine similarity over all pairs of their *n*-gram embeddings. Computing this score for every pair of texts yields an intertextual strength matrix, from which an intertextual network can be built to identify central texts and community structures. The method is intended as a scalable and interpretable alternative to manual annotation and syntactic matching. Validation on four texts with known degrees of intertextuality supports the method's validity, and a scalability test on 267 diverse literary texts demonstrates its efficiency.

📝 Abstract
Intertextuality is a central tenet in literary studies. It refers to the intricate links between literary texts that are created by various types of references. This paper proposes a new quantitative model of intertextuality to enable scalable analysis and network-based insights: the model performs pairwise comparisons of the embeddings of n-grams from two texts and averages the results to give the texts' overall intertextuality. Validation on four texts with known degrees of intertextuality, alongside a scalability test on 267 diverse texts, demonstrates the method's effectiveness and efficiency. Network analysis further reveals centrality and community structures, affirming the approach's success in capturing and quantifying intertextual relationships.
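The pairwise scoring step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the listing does not say which embedding model or n-gram unit is used, so `toy_embed` (hashed character-trigram counts) is a hypothetical stand-in for a real embedding model, and word-level bigrams (`n=2`) are an arbitrary choice. All function names are invented for this example.

```python
import math
import zlib
from itertools import product


def word_ngrams(text, n=2):
    """Split a text into overlapping word-level n-grams (assumed unit)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def toy_embed(ngram, dim=64):
    """Hypothetical placeholder embedding: character-trigram counts hashed
    into `dim` buckets. A real pipeline would call an embedding model here."""
    vec = [0.0] * dim
    padded = f" {ngram} "
    for i in range(len(padded) - 2):
        bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def intertextuality(text_a, text_b, n=2):
    """Average cosine similarity over all pairs of n-gram embeddings."""
    emb_a = [toy_embed(g) for g in word_ngrams(text_a, n)]
    emb_b = [toy_embed(g) for g in word_ngrams(text_b, n)]
    if not emb_a or not emb_b:
        return 0.0  # a text shorter than n tokens has no n-grams
    sims = [cosine(u, v) for u, v in product(emb_a, emb_b)]
    return sum(sims) / len(sims)
```

Calling `intertextuality(a, b)` for every pair in a corpus would fill the intertextual strength matrix. Because the score averages over all n-gram pairs, the cost grows with the product of the two texts' n-gram counts, which is why a vectorised or batched implementation would be preferable at the 267-text scale.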
Problem

Research questions and friction points this paper is trying to address.

Modeling intertextuality quantitatively using n-gram embeddings
Enabling scalable analysis of literary text relationships
Capturing and quantifying intertextual links through network structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

N-gram embeddings for pairwise text comparisons
Averaging results to quantify intertextuality
Network analysis revealing centrality and communities
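The network step can be sketched in the same spirit. The listing does not name the centrality measure or the community-detection algorithm, so the choices below are assumptions made for illustration: edges are kept when the pairwise strength meets a hypothetical `threshold`, centrality is weighted degree, and connected components serve as a crude stand-in for community detection.

```python
def build_network(strength, threshold):
    """Keep an undirected edge i-j when strength[i][j] >= threshold.
    `strength` is a symmetric matrix of pairwise intertextuality scores."""
    n = len(strength)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if strength[i][j] >= threshold:
                adj[i][j] = strength[i][j]
                adj[j][i] = strength[i][j]
    return adj


def weighted_degree(adj):
    """Weighted degree centrality: sum of edge weights at each node."""
    return {node: sum(nbrs.values()) for node, nbrs in adj.items()}


def components(adj):
    """Connected components via depth-first search (assumed stand-in
    for a proper community-detection algorithm)."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Made-up scores for four texts; not results from the paper.
    demo = [
        [1.0, 0.8, 0.7, 0.1],
        [0.8, 1.0, 0.6, 0.2],
        [0.7, 0.6, 1.0, 0.1],
        [0.1, 0.2, 0.1, 1.0],
    ]
    net = build_network(demo, threshold=0.5)
    print(weighted_degree(net))
    print(components(net))
```

With the made-up matrix in the demo, texts 0, 1 and 2 form one group and text 3 stays isolated. In practice a graph library and a modularity-based method would likely give more informative communities than thresholded components.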