Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective

📅 2025-11-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing machine-generated text (MGT) detection methods treat labels as deterministic "gold standards," overlooking the label imprecision that arises from the ambiguous human–machine boundary, inter-annotator inconsistency, and detectors whose capabilities outstrip human annotation. To address this, the paper proposes an easy-to-hard enhancement framework, distinct from knowledge distillation: a lightweight "easy" supervisor handles the simpler task of detecting longer texts, which provably alleviates the impact of inexact labels and so provides reliable initial supervision. By structurally incorporating the detector into the supervisor, the supervisor is theoretically modeled as a lower performance bound on the detector, so jointly optimizing the supervisor indirectly optimizes the detector toward the latent "golden" labels. Extensive experiments across challenging scenarios, including cross-LLM, cross-domain, mixed-text, and paraphrase-attack settings, demonstrate substantial gains in robustness and generalization.

📝 Abstract
Existing machine-generated text (MGT) detection methods implicitly treat labels as the "golden standard". However, we reveal boundary ambiguity in MGT detection, implying that traditional training paradigms are inexact. Moreover, the limitations of human cognition and the superintelligence of detectors make inexact learning widespread and inevitable. To this end, we propose an easy-to-hard enhancement framework that provides reliable supervision under such inexact conditions. Distinct from knowledge distillation, our framework employs an easy supervisor targeting the relatively simple task of longer-text detection (despite its weaker capabilities) to enhance the more challenging target detector. First, the longer texts targeted by the supervisor theoretically alleviate the impact of inexact labels, laying the foundation for reliable supervision. Second, by structurally incorporating the detector into the supervisor, we theoretically model the supervisor as a lower performance bound for the detector. Thus, optimizing the supervisor indirectly optimizes the detector, ultimately approximating the underlying "golden" labels. Extensive experiments across diverse practical scenarios, including cross-LLM, cross-domain, mixed text, and paraphrase attacks, demonstrate the framework's effectiveness. The code is available at: https://github.com/tmlr-group/Easy2Hard.
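The abstract's first claim, that longer texts make supervision more reliable under inexact labels, follows from a simple variance-reduction argument: if each short segment yields a noisy detection score, averaging many segments of one long text shrinks the noise while preserving the class signal. The toy sketch below illustrates only this statistical intuition; it is not the paper's method, and `segment_score`, its base values, and the noise level are all hypothetical choices made for illustration.

```python
import random

def segment_score(is_machine, noise=0.35, rng=random):
    # Hypothetical noisy per-segment detector: a weak class signal
    # (0.7 for machine text, 0.3 for human text) plus uniform noise,
    # clipped to [0, 1]. Purely illustrative numbers.
    base = 0.7 if is_machine else 0.3
    return min(1.0, max(0.0, base + rng.uniform(-noise, noise)))

def detect(n_segments, is_machine, rng):
    # "Easy" long-text decision: average the segment scores of one
    # long text and threshold at 0.5. Averaging reduces noise variance
    # by ~1/n_segments while the class signal is unchanged.
    scores = [segment_score(is_machine, rng=rng) for _ in range(n_segments)]
    return (sum(scores) / n_segments) > 0.5

def accuracy(n_segments, trials=2000, seed=0):
    # Empirical accuracy over balanced machine/human examples.
    rng = random.Random(seed)
    correct = 0
    for t in range(trials):
        is_machine = (t % 2 == 0)
        correct += detect(n_segments, is_machine, rng) == is_machine
    return correct / trials

print(accuracy(1), accuracy(10))  # longer "texts" classify more reliably
```

Under these assumed noise levels, the one-segment detector is right only about four times in five, while averaging ten segments is nearly always right, which is the sense in which a long-text supervisor can offer a more trustworthy signal than the labels on individual short texts.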
Problem

Research questions and friction points this paper is trying to address.

Addresses boundary ambiguity in machine-generated text detection
Proposes easy-to-hard framework for reliable inexact supervision
Enhances detection across cross-LLM and adversarial scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Easy-to-hard framework enhances detectors via reliable supervision
Longer texts mitigate inexact label impacts for supervision
Structural integration models supervisor as detector's lower bound
Chenwang Wu, University of Science and Technology of China (Trustworthy Machine Learning, Data Mining)
Yiu-ming Cheung, Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Bo Han, Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Defu Lian, School of Computer Science, University of Science and Technology of China, Hefei, China