🤖 AI Summary
This study addresses the challenge of distinguishing human-authored text refined by large language models (LLMs) from LLM-generated text manually rewritten by humans—a gap in existing detectors that limits fine-grained content regulation. To this end, the work proposes the first fine-grained detection framework designed for a four-class classification task, explicitly modeling both the author and editor roles in text creation. Leveraging Rhetorical Structure Theory, the method constructs logical discourse graphs and extracts features at the elementary-discourse-unit level to jointly capture generative origin and editing style. Experimental results show that the proposed approach significantly outperforms twelve baseline models across multiple metrics, identifying fine-grained categories with high accuracy while maintaining a low false positive rate, thereby supporting policy-aligned content-provenance requirements.
📝 Abstract
The misuse of large language models (LLMs) requires precise detection of synthetic text. Existing works mainly follow binary or ternary classification settings, which can at best distinguish pure human text, pure LLM text, or collaborative text. This remains insufficient for nuanced regulation, as LLM-polished human text and humanized LLM text often trigger different policy consequences. In this paper, we explore fine-grained LLM-generated text detection under a rigorous four-class setting. To handle such complexities, we propose RACE (Rhetorical Analysis for Creator-Editor Modeling), a fine-grained detection method that characterizes the distinct signatures of the creator and the editor. Specifically, RACE utilizes Rhetorical Structure Theory to construct a logic graph for the creator's foundation while extracting Elementary Discourse Unit-level features for the editor's style. Experiments show that RACE outperforms 12 baselines in identifying fine-grained types with low false alarms, offering a policy-aligned solution for LLM regulation.
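To make the four-class setting and the EDU-level view concrete, here is a minimal toy sketch. Everything in it is an assumption for illustration: the class names, the naive punctuation-based EDU segmenter, and the two stylistic features are placeholders, not RACE's actual RST parser or feature set.

```python
import re

# Hypothetical labels for the four-class setting: (creator, editor) pairs.
FOUR_CLASSES = [
    "pure_human",          # human creator, no LLM editing
    "pure_llm",            # LLM creator, no human editing
    "llm_polished_human",  # human creator, LLM editor
    "humanized_llm",       # LLM creator, human editor
]

def segment_edus(text: str) -> list[str]:
    """Naive Elementary Discourse Unit segmentation on clause punctuation.
    Real systems (including RACE, presumably) use a trained RST discourse
    parser rather than a regex."""
    parts = re.split(r"[.;,]\s*", text.strip())
    return [p for p in parts if p]

def edu_style_features(edus: list[str]) -> dict[str, float]:
    """Toy per-EDU 'editor style' features: EDU count and mean EDU length.
    RACE extracts richer EDU-level signals on top of its logic graph."""
    lengths = [len(e.split()) for e in edus]
    return {
        "num_edus": float(len(edus)),
        "mean_edu_len": sum(lengths) / max(len(lengths), 1),
    }

text = ("LLMs can polish human drafts. Humans can also rewrite LLM output, "
        "which complicates detection.")
edus = segment_edus(text)
feats = edu_style_features(edus)
print(len(FOUR_CLASSES), int(feats["num_edus"]))  # → 4 3
```

In a full pipeline, graph-level features from the RST logic graph (creator signal) would be concatenated with these EDU-level features (editor signal) and fed to a four-way classifier.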