Measuring Embedding Sensitivity to Authorial Style in French: Comparing Literary Texts with Language Model Rewritings

📅 2026-05-11

🤖 AI Summary
This study investigates how much authorial style information is encoded in text embeddings and how much of it remains detectable after the text is rewritten by a large language model (LLM). Using a controlled French literary dataset, the authors compare original and LLM-rewritten texts by analyzing the embedding space and measuring changes in embedding dispersion. The work provides the first evidence in French that embeddings capture author-specific stylistic features, that these features persist after LLM rewriting, and that rewritten texts carry an additional layer of model-specific generation patterns. These findings offer a basis for detecting author imitation in machine-generated text.
📝 Abstract
Large language models (LLMs) can convincingly imitate human writing styles, yet it remains unclear how much stylistic information is encoded in embeddings from any language model and retained after LLM rewriting. We investigate these questions in French, using a controlled literary dataset to quantify the effect of stylistic variation via changes in embedding dispersion. We observe that embeddings reliably capture authorial stylistic features and that these signals persist after rewriting, while also exhibiting LLM-specific patterns. These analytical results offer promising directions for authorship imitation detection in the era of language models.
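The abstract does not specify which dispersion metric is used. As a minimal sketch, assuming dispersion is taken to be the mean cosine distance of a group's embeddings from their centroid (one common choice, not necessarily the authors'), the comparison between originals and rewritings could look like the following. The function names and the toy 3-dimensional vectors are hypothetical and are not data from the paper.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dispersion(vectors):
    """Mean cosine distance of each vector from the group centroid."""
    c = centroid(vectors)
    return sum(cosine_distance(v, c) for v in vectors) / len(vectors)


# Hypothetical toy embeddings: one vector per text.
# "originals" stands in for texts by different authors (spread out),
# "rewritten" for the same texts after rewriting (illustratively tighter).
originals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rewritten = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]

# A negative shift means the rewritten group is less dispersed.
shift = dispersion(rewritten) - dispersion(originals)
print(f"dispersion shift after rewriting: {shift:.4f}")
```

In practice the vectors would come from an embedding model applied to each text, and the groups would be defined by the controlled variable of interest (author, or original versus rewritten). The direction of the shift in this toy example is illustrative only and says nothing about the paper's results.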
Problem

Research questions and friction points this paper is trying to address.

authorial style
embedding sensitivity
language model rewriting
stylistic variation
French literary texts
Innovation

Methods, ideas, or system contributions that make the work stand out.

embedding sensitivity
authorial style
language model rewriting
stylistic variation
authorship imitation detection