Mixture of Detectors: A Compact View of Machine-Generated Text Detection

📅 2025-09-26

📈 Citations: 0

✨ Influential: 0

career value

196K/year

🤖 AI Summary

This work addresses the multi-granularity and multi-scenario challenges in machine-generated text detection—specifically, document-level binary/multi-class classification (including generator attribution), sentence-level mixed-text segmentation, and robust detection under adversarial attacks. We propose a unified cross-granularity detection framework that, for the first time, integrates generator attribution, adversarial robustness, and fine-grained segmentation within a single paradigm. To support comprehensive evaluation, we introduce BMAS-English, a benchmark dataset enabling multi-task assessment. Our approach synergistically combines deep classification models, sequence labeling, and adversarial sample generation via multi-task learning. Experiments demonstrate significant improvements over state-of-the-art methods in both generator attribution and adversarial sample identification. The framework delivers a more holistic, robust, and scalable technical pathway for AIGC detection.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) are gearing up to surpass human creativity. The veracity of the statement needs careful consideration. In recent developments, critical questions arise regarding the authenticity of human work and the preservation of their creativity and innovative abilities. This paper investigates such issues. This paper addresses machine-generated text detection across several scenarios, including document-level binary and multiclass classification or generator attribution, sentence-level segmentation to differentiate between human-AI collaborative text, and adversarial attacks aimed at reducing the detectability of machine-generated text. We introduce a new work called BMAS English: an English language dataset for binary classification of human and machine text, for multiclass classification, which not only identifies machine-generated text but can also try to determine its generator, and Adversarial attack addressing where it is a common act for the mitigation of detection, and Sentence-level segmentation, for predicting the boundaries between human and machine-generated text. We believe that this paper will address previous work in Machine-Generated Text Detection (MGTD) in a more meaningful way.

Problem

Research questions and friction points this paper is trying to address.

Detecting machine-generated text across document classification scenarios

Identifying human-AI text boundaries through sentence-level segmentation

Addressing adversarial attacks that reduce machine text detectability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduced BMAS dataset for binary and multiclass classification

Developed sentence-level segmentation for human-AI text boundaries

Addressed adversarial attacks to reduce machine text detectability

🔎 Similar Papers

MOSAIC: Multiple Observers Spotting AI Content, a Robust Approach to Machine-Generated Text Detection

2024-09-11Citations: 0

Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods

2024-06-21Journal of Artificial Intelligence ResearchCitations: 6

Qualcomm

$104,000.00 - $156,000.00

San Diego, California, United States of America

Authors to Follow