MGTEVAL: An Interactive Platform for Systematic Evaluation of Machine-Generated Text Detectors

📅 2026-04-27

🤖 AI Summary

This work addresses the lack of standardized evaluation protocols for machine-generated text detectors, which makes results hard to compare and reproduce. To this end, we propose an extensible, standardized benchmarking platform that integrates four core modules: dataset construction, textual adversarial attacks, detector training, and multidimensional performance evaluation. The platform supports twelve attack methods, multiple state-of-the-art detection algorithms, and text generation with configurable large language models, and it offers both command-line and web-based interfaces. Users can construct custom evaluation benchmarks without modifying code, which is intended to improve comparability, reproducibility, and usability. To our knowledge, this is the first effort to systematize and standardize the entire evaluation pipeline for machine-generated text detection.

📝 Abstract

We present MGTEVAL, an extensible platform for systematic evaluation of Machine-Generated Text (MGT) detectors. Despite rapid progress in MGT detection, existing evaluations are often fragmented across datasets, preprocessing, attacks, and metrics, making results hard to compare and reproduce. MGTEVAL organizes the workflow into four components: Dataset Building, Dataset Attack, Detector Training, and Performance Evaluation. It supports constructing custom benchmarks by generating MGT with configurable LLMs, applying 12 text attacks to test sets, training detectors via a unified interface, and reporting effectiveness, robustness, and efficiency. The platform provides both command-line and web-based interfaces, so users can run experiments without rewriting code.
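
To make the four-component workflow concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions and not MGTEVAL's actual API: every name in it (`build_dataset`, `upper_case_attack`, `KeywordDetector`, `evaluate`) is hypothetical, the data is a toy stand-in for human and LLM text, and the single attack shown is not claimed to be one of the platform's 12 attacks.

```python
import time
from typing import Callable, Dict, List, Tuple

# (text, label) pairs; label 1 = machine-generated, 0 = human-written.
Example = Tuple[str, int]


def build_dataset() -> Tuple[List[Example], List[Example]]:
    """Stage 1, Dataset Building: hard-coded toy data in place of LLM generation."""
    train = [
        ("the cat sat on the mat", 0),
        ("i went to the shop today", 0),
        ("as an ai language model i can help", 1),
        ("in conclusion it is important to note", 1),
    ]
    test = [
        ("the dog sat on the mat", 0),
        ("it is important to note this", 1),
    ]
    return train, test


def upper_case_attack(text: str) -> str:
    """Stage 2, Dataset Attack: a trivial perturbation chosen only for illustration."""
    return text.upper()


class KeywordDetector:
    """Stage 3, Detector Training: flags words seen only in machine-generated training text."""

    def fit(self, data: List[Example]) -> "KeywordDetector":
        human = {w for text, label in data if label == 0 for w in text.split()}
        machine = {w for text, label in data if label == 1 for w in text.split()}
        self.keywords = machine - human
        return self

    def predict(self, text: str) -> int:
        return int(any(w in self.keywords for w in text.split()))


def accuracy(detector: KeywordDetector, data: List[Example]) -> float:
    return sum(detector.predict(text) == label for text, label in data) / len(data)


def evaluate(
    detector: KeywordDetector,
    test: List[Example],
    attack: Callable[[str], str],
) -> Dict[str, float]:
    """Stage 4, Performance Evaluation: effectiveness, robustness, and efficiency."""
    start = time.perf_counter()
    clean_acc = accuracy(detector, test)
    elapsed = time.perf_counter() - start
    # The attack is applied to the test set only, as the abstract describes.
    attacked = [(attack(text), label) for text, label in test]
    return {
        "effectiveness": clean_acc,
        "robustness": accuracy(detector, attacked),
        "seconds": elapsed,
    }


if __name__ == "__main__":
    train_set, test_set = build_dataset()
    detector = KeywordDetector().fit(train_set)
    print(evaluate(detector, test_set, upper_case_attack))
```

The point of the sketch is the separation of concerns: because the attack and the detector are passed into `evaluate` as interchangeable parts, a different attack or detector can be swapped in without touching the evaluation code, which is the kind of unified interface the abstract describes.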

Problem

Research questions and friction points this paper is trying to address.

Machine-Generated Text Detection

Evaluation Benchmark

Reproducibility

Robustness Evaluation

Systematic Evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

machine-generated text detection

systematic evaluation

text attack robustness

extensible evaluation platform

benchmark construction
