On the Generalization Ability of Machine-Generated Text Detectors

📅 2024-12-23
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing machine-generated text (MGT) detectors for large language models (LLMs) suffer from poor generalization in academic writing—particularly under misuse scenarios such as plagiarism and fabricated citations—due to insufficient domain diversity and static evaluation paradigms. Method: We introduce MGT-Academic, the first large-scale, discipline-diverse academic MGT dataset (336M tokens), and formulate continual class-incremental attribution as a novel detection task. We propose MGTBench-2.0, a benchmark framework integrating Transformer-based detectors, cross-domain transfer learning, few-/zero-shot adaptation, and dynamic class-incremental learning. Contribution/Results: Comprehensive evaluation reveals severe generalization degradation of state-of-the-art detectors on cross-domain attribution. Our continual learning pipeline significantly enhances scalability to new disciplines; all eight evaluated adaptation strategies demonstrate consistent effectiveness. The dataset and code are publicly released to foster reproducible research.

📝 Abstract
The rising popularity of large language models (LLMs) has raised concerns about machine-generated text (MGT), particularly in academic settings, where issues like plagiarism and misinformation are prevalent. As a result, developing a highly generalizable and adaptable MGT detection system has become an urgent priority. Given that LLMs are most commonly misused in academic writing, this work investigates the generalization and adaptation capabilities of MGT detectors in three key aspects specific to academic writing. First, we construct MGT-Academic, a large-scale dataset comprising over 336M tokens and 749K samples. MGT-Academic focuses on academic writing, featuring human-written texts (HWTs) and MGTs across STEM, Humanities, and Social Sciences, paired with an extensible code framework for efficient benchmarking. Second, we benchmark the performance of various detectors on binary classification and attribution tasks in both in-domain and cross-domain settings. This benchmark reveals the often-overlooked challenges of attribution. Third, we introduce a novel attribution task in which models must adapt to new classes over time without (or with very limited) access to prior training data, in both few-shot and many-shot scenarios. We implement eight adaptation techniques to improve performance and highlight the inherent complexity of the task. Our findings provide insight into the generalization and adaptation abilities of MGT detectors across diverse scenarios and lay the foundation for building robust, adaptive detection systems. The code framework is available at https://github.com/Y-L-LIU/MGTBench-2.0.
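The class-incremental attribution task from the abstract can be sketched as follows: generator classes arrive in stages, and the detector must attribute texts to every class seen so far while retaining only a small exemplar buffer of past training data. This is a minimal illustrative sketch, not the paper's method — the nearest-centroid classifier, the toy 2-D "features", and the class names are all hypothetical stand-ins.

```python
# Sketch of class-incremental attribution with a limited replay buffer.
# All models, features, and class names here are illustrative assumptions.
import random

random.seed(0)

def make_samples(center, n=30, spread=0.5):
    """Toy feature vectors clustered around a per-class center."""
    return [(center[0] + random.gauss(0, spread),
             center[1] + random.gauss(0, spread)) for _ in range(n)]

# Stages: new generator classes appear over time (hypothetical names).
stages = [
    {"human": make_samples((0, 0)), "gpt": make_samples((4, 0))},
    {"llama": make_samples((0, 4))},
    {"claude": make_samples((4, 4))},
]

BUFFER_PER_CLASS = 5   # very limited access to prior training data
memory = {}            # exemplar buffer: class -> few stored samples
centroids = {}

def centroid(samples):
    xs, ys = zip(*samples)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def predict(x):
    """Attribute a sample to the nearest class centroid."""
    return min(centroids, key=lambda c: (x[0] - centroids[c][0]) ** 2 +
                                        (x[1] - centroids[c][1]) ** 2)

for t, stage in enumerate(stages):
    # Train on the new classes plus the small replay buffer of old ones.
    train = {c: list(s) for c, s in stage.items()}
    for c, buf in memory.items():
        train.setdefault(c, []).extend(buf)
    for c, samples in train.items():
        centroids[c] = centroid(samples)
    # Keep only a few exemplars per class for future stages.
    for c, samples in stage.items():
        memory[c] = samples[:BUFFER_PER_CLASS]
    print(f"stage {t}: classes seen = {sorted(centroids)}")

# After the final stage, attribution must cover every class seen so far.
assert predict((4.1, 3.9)) == "claude"
assert predict((-0.2, 0.1)) == "human"
```

The replay buffer captures the core difficulty the abstract names: old classes are re-fit from only a handful of stored exemplars at each stage, so their decision boundaries degrade as new classes arrive — exactly the failure mode the paper's eight adaptation techniques target.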
Problem

Research questions and friction points this paper is trying to address.

Evaluates MGT detectors in academic contexts
Benchmarks detectors across diverse academic disciplines
Introduces novel attribution task for MGT detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale academic dataset MGT-Academic
Benchmarking in-domain and cross-domain detection
Novel adaptation techniques for dynamic classification
👥 Authors
Yule Liu
DSA Thrust, HKUST(GZ)
LLM Security
Zhiyuan Zhong
University of Utah
Software Engineering
Yifan Liao
National University of Singapore (Chongqing Research Institute)
Zhen Sun
DSA Thrust, HKUST(GZ)
LLM Security
Jingyi Zheng
HKUST(GZ)
Jiaheng Wei
The Hong Kong University of Science and Technology (Guangzhou)
Qingyuan Gong
Tenure Track Associate Professor, Fudan University
social networks, user behavior analysis, network security
Fenghua Tong
Qilu University of Technology
Yang Chen
Fudan University
Yang Zhang
CISPA Helmholtz Center for Information Security
Xinlei He
Assistant Professor, HKUST(GZ)
Trustworthy Machine Learning, Security, Privacy