MAGA-Bench: Machine-Augment-Generated Text via Alignment Detection Benchmark

πŸ“… 2026-01-08
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This study addresses the growing challenge of detecting increasingly realistic machine-generated text, which fuels misinformation and exposes the limited generalization of current detectors due to training data constraints. To this end, the authors propose MAGA, a framework that enhances detector robustness and generalization by generating more challenging texts through full-process alignment. The core innovation lies in a reinforcement learning mechanism based on detector feedback (RLDF), which systematically optimizes the generation process to effectively attack and thereby strengthen detectors. Experimental results demonstrate that fine-tuning RoBERTa detectors on the MAGA dataset improves average generalization AUC by 4.60%, while MAGA-generated texts reduce the average AUC of existing detectors by 8.13%, substantially validating the framework’s effectiveness.

πŸ“ Abstract
Large Language Model (LLM) alignment is constantly evolving, and Machine-Generated Text (MGT) is becoming increasingly difficult to distinguish from Human-Written Text (HWT). This has exacerbated abuse issues such as fake news and online fraud. Fine-tuned detectors' generalization ability is highly dependent on dataset quality, and simply expanding the sources of MGT is insufficient; further augmentation of the generation process is required. According to HC-Var's theory, enhancing the alignment of generated text can not only facilitate attacks on existing detectors to test their robustness, but also help improve the generalization ability of detectors fine-tuned on it. Therefore, we propose Machine-Augment-Generated Text via Alignment (MAGA). MAGA's pipeline achieves comprehensive alignment from prompt construction to the reasoning process, among which Reinforcement Learning from Detector Feedback (RLDF), which we propose, serves as a key component. In our experiments, the RoBERTa detector fine-tuned on the MAGA training set achieved an average improvement of 4.60% in generalization detection AUC, while the MAGA dataset caused an average decrease of 8.13% in the AUC of the selected detectors. We expect these results to inform future research on the generalization ability of detectors.
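The core RLDF idea described above — rewarding a generator when a detector scores its output as human-written — can be sketched minimally as below. The detector here is a toy stand-in (a word-length-variance heuristic), not the paper's actual RoBERTa classifier, and all function names are illustrative assumptions rather than the authors' implementation.

```python
def toy_detector_prob_machine(text: str) -> float:
    """Toy stand-in for a trained MGT detector: returns a pseudo-probability
    that `text` is machine-generated. A real RLDF loop would instead query a
    fine-tuned classifier (e.g. a RoBERTa detector)."""
    words = text.split()
    if not words:
        return 1.0
    # Heuristic: very uniform word lengths look "machine-like" to this toy.
    lengths = [len(w) for w in words]
    mean = sum(lengths) / len(lengths)
    var = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    # Low variance -> probability near 1 (machine); high variance -> near 0.
    return max(0.0, min(1.0, 1.0 / (1.0 + var)))


def rldf_reward(text: str) -> float:
    """Detector-feedback reward: the detector's probability of 'human-written'.
    Maximizing this reward pushes the generator toward detector-evading text,
    which in turn yields harder training data for the detector."""
    return 1.0 - toy_detector_prob_machine(text)
```

In the full framework this scalar reward would be fed into a policy-gradient update (e.g. a PPO-style loop) over the generator, so that attack and detector hardening proceed jointly; that loop is omitted here for brevity.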
Problem

Research questions and friction points this paper is trying to address.

Machine-Generated Text
Alignment
Detection Generalization
Text Detection
LLM Abuse
Innovation

Methods, ideas, or system contributions that make the work stand out.

MAGA
Alignment
RLDF
Machine-Generated Text Detection
Generalization
Anyang Song
College of Computer Science and Artificial Intelligence, Fudan University
Ying Cheng
Fudan University
Self-Supervised Learning · Multimodal Analysis
Yiqian Xu
College of Computer Science and Artificial Intelligence, Fudan University
Rui Feng
College of Computer Science and Artificial Intelligence, Fudan University