READER: Reasoning-Enhanced AI-Generated Text Detection

📅 2026-05-24
🤖 AI Summary
Current AI-generated text detectors degrade substantially under distribution shift and offer little interpretability. This work proposes READER, a framework that integrates structured reasoning into AI text detection by generating a rationale before deciding whether a text is human-written or AI-generated at inference time. To support this approach, the authors construct READ, a curated dataset of rationales and verdicts, and fine-tune a 1.5B-parameter language model on it. In the reported experiments, READER consistently outperforms existing detectors and prompted LLM baselines (GPT-5.2, Gemini-3-Pro, and DeepSeek-V3.2) that are 100 to 1000 times larger.
📝 Abstract
Recent advances in large language models (LLMs) have made it increasingly difficult to distinguish human-written text from AI-generated content. Many existing detectors train supervised neural classifiers that achieve strong in-distribution performance but are often opaque and can degrade substantially under distribution shift. We present READER, a reasoning-enhanced AI text detector that outputs both a human/AI label and a structured rationale describing the evidence for its decision. A key component of our approach is READ, a curated supervision set of rationales and verdicts. We fine-tune an LLM on READ to build READER, which reasons before detecting at inference time. Despite having only 1.5B parameters, READER consistently outperforms existing detectors as well as prompted, high-capacity LLM baselines (GPT-5.2, Gemini-3-Pro, and DeepSeek-V3.2), which are 100 to 1000 times larger in scale.
Problem

Research questions and friction points this paper is trying to address.

AI-generated text detection
large language models
distribution shift
text authenticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

reasoning-enhanced detection
structured rationale
READ dataset
small-scale LLM fine-tuning
distribution robustness
Pingfan Su
Department of Statistics, London School of Economics and Political Science
Kai Ye
Department of Statistics, London School of Economics and Political Science
Shijin Gong
School of Management, University of Science and Technology of China
Erhan Xu
Department of Statistics, London School of Economics and Political Science
Jin Zhu
School of Mathematics, University of Birmingham
machine learning
Giulia Livieri
Department of Statistics, London School of Economics and Political Science
Chengchun Shi
London School of Economics and Political Science
Large Language Models, Reinforcement Learning, Statistics