MAGA-Bench: Machine-Augment-Generated Text via Alignment Detection Benchmark

📅 2026-01-08

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the growing difficulty of detecting increasingly realistic machine-generated text, which fuels misinformation. Current detectors generalize poorly because of constraints in their training data. To address this, the authors propose MAGA, a framework that generates more challenging texts by applying alignment across the whole generation process, with the goal of improving detector robustness and generalization. The core innovation is a reinforcement learning mechanism based on detector feedback (RLDF), which optimizes the generation process so that its outputs attack detectors and, when used as training data, strengthen them. In the reported experiments, fine-tuning RoBERTa detectors on the MAGA dataset improves average generalization AUC by 4.60%, while MAGA-generated texts reduce the average AUC of existing detectors by 8.13%.
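The detector-feedback idea described above can be illustrated with a small sketch. The summary does not specify the reward used in RLDF, so everything here is an assumption for illustration: `toy_detector` is a hypothetical stand-in for a trained detector, `rldf_reward` assumes the reward is simply one minus the detector's machine-probability, and `rank_candidates` only ranks candidate generations rather than performing an actual policy update.

```python
from typing import Callable, List


def toy_detector(text: str) -> float:
    """Hypothetical stand-in detector returning P(machine-generated) in [0, 1].

    It scores repetitive text as more machine-like (1 - lexical diversity).
    A real detector would be a trained classifier such as a fine-tuned RoBERTa.
    """
    words = text.lower().split()
    if not words:
        return 0.5  # no evidence either way
    diversity = len(set(words)) / len(words)
    return 1.0 - diversity


def rldf_reward(text: str, detector: Callable[[str], float]) -> float:
    """Assumed reward: higher when the detector is less confident the text is machine-made."""
    return 1.0 - detector(text)


def rank_candidates(candidates: List[str], detector: Callable[[str], float]) -> List[str]:
    """Order candidate generations from most to least rewarded."""
    return sorted(candidates, key=lambda t: rldf_reward(t, detector), reverse=True)


if __name__ == "__main__":
    samples = ["the the the the", "a quick brown fox"]
    for s in rank_candidates(samples, toy_detector):
        print(f"{rldf_reward(s, toy_detector):.2f}  {s}")
```

In a reinforcement learning setup, a scalar reward of this kind would typically drive a policy-gradient update of the generator; the ranking step here is only a simplified proxy for that loop.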

📝 Abstract

The alignment of Large Language Models (LLMs) is constantly evolving, and Machine-Generated Text (MGT) is becoming increasingly difficult to distinguish from Human-Written Text (HWT). This has exacerbated abuse issues such as fake news and online fraud. The generalization ability of fine-tuned detectors is highly dependent on dataset quality, and simply expanding the sources of MGT is insufficient; further augmentation of the generation process is required. According to HC-Var's theory, enhancing the alignment of generated text can not only facilitate attacks on existing detectors to test their robustness, but also help improve the generalization ability of detectors fine-tuned on it. Therefore, we propose Machine-Augment-Generated Text via Alignment (MAGA). MAGA's pipeline achieves comprehensive alignment from prompt construction to the reasoning process, in which Reinforced Learning from Detectors Feedback (RLDF), systematically proposed by us, serves as a key component. In our experiments, the RoBERTa detector fine-tuned on the MAGA training set achieved an average improvement of 4.60% in generalization detection AUC. The MAGA dataset caused an average decrease of 8.13% in the AUC of the selected detectors, and we expect it to serve as a useful reference for future research on the generalization ability of detectors.
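Both headline results are stated as changes in detection AUC, so a small worked example of the metric may help. The sketch below computes ROC AUC from its pairwise-ranking definition in pure Python. The labels and scores are made-up toy values, not data from the paper, and `roc_auc` is an illustrative helper rather than the authors' evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random machine text (label 1)
    receives a higher detector score than a random human text (label 0).
    Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]  # 1 = machine-generated, 0 = human-written

    # Toy scores where the detector separates the classes perfectly.
    easy_scores = [0.9, 0.8, 0.2, 0.1]
    # Toy scores where one machine text now looks human to the detector.
    hard_scores = [0.9, 0.15, 0.2, 0.1]

    print(roc_auc(labels, easy_scores))  # 1.0
    print(roc_auc(labels, hard_scores))  # 0.75
```

In this toy case, a single machine text slipping below a human text's score lowers AUC from 1.0 to 0.75, which is the kind of drop that harder, better-aligned generations are meant to produce.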

Problem

Research questions and friction points this paper is trying to address.

Machine-Generated Text

Alignment

Detection Generalization

Text Detection

LLM Abuse

Innovation

Methods, ideas, or system contributions that make the work stand out.

MAGA

Alignment

RLDF