Who Wrote the Book? Detecting and Attributing LLM Ghostwriters

📅 2026-03-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses authorship attribution for long-form texts generated by large language models (LLMs) in out-of-distribution (OOD) scenarios, such as cross-domain settings or attribution when the target model is unknown. To tackle this problem, the authors propose TRACE, a lightweight and interpretable fingerprinting method that builds textual fingerprints from token-level transition patterns (for example, word frequency ranks) estimated by a compact language model. They also introduce GhostWriteBench, the first book-scale benchmark for LLM-generated text attribution, comprising documents exceeding 50,000 words. Experiments show that TRACE achieves state-of-the-art performance across both closed-source and open-source LLMs and remains accurate and robust under data-scarce and OOD conditions.
📝 Abstract
In this paper, we introduce GhostWriteBench, a dataset for LLM authorship attribution. It comprises long-form texts (50K+ words per book) generated by frontier LLMs and is designed to test generalisation across multiple out-of-distribution (OOD) dimensions, including domain and unseen LLM authors. We also propose TRACE, an interpretable and lightweight fingerprinting method that works for both open- and closed-source models. TRACE builds a fingerprint from token-level transition patterns (e.g., word rank) estimated by a separate lightweight language model. Experiments on GhostWriteBench demonstrate that TRACE achieves state-of-the-art performance, remains robust in OOD settings, and works well with limited training data.
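The abstract's core idea (rank each token under a lightweight language model, then summarize transitions between rank buckets as a fingerprint) can be sketched as follows. This is an illustrative assumption, not the authors' implementation: a toy unigram frequency table stands in for the paper's lightweight language model, and the bucket edges and all function names are invented for the example.

```python
from collections import Counter

def rank_sequence(tokens, freq):
    # Rank each token by how probable the stand-in "lightweight LM"
    # (here just a unigram frequency table) considers it: 0 = most frequent.
    order = {tok: r for r, (tok, _) in enumerate(freq.most_common())}
    unseen = len(order)  # tokens the model has never seen get the worst rank
    return [order.get(t, unseen) for t in tokens]

def bucket(rank, edges=(10, 100, 1000)):
    # Coarsen ranks into a few buckets; the edge values are illustrative.
    for i, e in enumerate(edges):
        if rank < e:
            return i
    return len(edges)

def fingerprint(text, freq):
    # Count transitions between consecutive rank buckets and normalize
    # to a probability distribution; this distribution is the fingerprint.
    tokens = text.lower().split()
    buckets = [bucket(r) for r in rank_sequence(tokens, freq)]
    trans = Counter(zip(buckets, buckets[1:]))
    total = sum(trans.values()) or 1
    return {pair: n / total for pair, n in trans.items()}
```

Two such fingerprints could then be compared with any vector distance (e.g., cosine) to attribute a document to its closest candidate model; the paper's actual features and scoring are not reproduced here.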
Problem

Research questions and friction points this paper is trying to address.

authorship attribution
large language models
ghostwriting detection
out-of-distribution generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

authorship attribution
large language models
out-of-distribution generalization
fingerprinting
token-level transition patterns