Mixture of Detectors: A Compact View of Machine-Generated Text Detection

📅 2025-09-26
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the multi-granularity and multi-scenario challenges in machine-generated text detection—specifically, document-level binary/multi-class classification (including generator attribution), sentence-level mixed-text segmentation, and robust detection under adversarial attacks. We propose a unified cross-granularity detection framework that, for the first time, integrates generator attribution, adversarial robustness, and fine-grained segmentation within a single paradigm. To support comprehensive evaluation, we introduce BMAS-English, a benchmark dataset enabling multi-task assessment. Our approach synergistically combines deep classification models, sequence labeling, and adversarial sample generation via multi-task learning. Experiments demonstrate significant improvements over state-of-the-art methods in both generator attribution and adversarial sample identification. The framework delivers a more holistic, robust, and scalable technical pathway for AIGC detection.

📝 Abstract
Large Language Models (LLMs) are said to be on the verge of surpassing human creativity, a claim that needs careful consideration. Recent developments raise critical questions about the authenticity of human work and the preservation of human creativity and innovative ability. This paper investigates these issues by addressing machine-generated text detection across several scenarios: document-level binary classification, document-level multiclass classification (generator attribution), sentence-level segmentation to separate the human and machine portions of human-AI collaborative text, and adversarial attacks aimed at reducing the detectability of machine-generated text. We introduce BMAS English, an English-language dataset covering four tasks: Binary classification of human versus machine text; Multiclass classification, which not only identifies machine-generated text but also attempts to determine its generator; Adversarial attacks, a common way of evading detection; and Sentence-level segmentation, which predicts the boundaries between human and machine-generated text. We believe this paper will advance previous work in Machine-Generated Text Detection (MGTD) in a more meaningful way.
Problem

Research questions and friction points this paper is trying to address.

Detecting machine-generated text across document classification scenarios
Identifying human-AI text boundaries through sentence-level segmentation
Addressing adversarial attacks that reduce machine text detectability
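
To make the sentence-level segmentation task concrete, here is a minimal sketch of how a human-AI boundary can be read off per-sentence labels. The label names ("human", "machine") and the helper find_boundaries are illustrative assumptions, not the paper's data format or method; a real detector would predict the labels, which are hard-coded here.

```python
from typing import List


def find_boundaries(labels: List[str]) -> List[int]:
    """Return each index i where sentence i has a different label
    than sentence i - 1, i.e. where a new authorship segment starts."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical per-sentence labels for a five-sentence mixed document.
labels = ["human", "human", "machine", "machine", "human"]
print(find_boundaries(labels))  # prints [2, 4]
```

Under this framing, segmentation reduces to per-sentence classification followed by a change-point scan, so an error on a single sentence can create or shift a boundary.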
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduced BMAS dataset for binary and multiclass classification
Developed sentence-level segmentation for human-AI text boundaries
Addressed adversarial attacks that reduce machine text detectability
Sai Teja Lekkala
Computer Science & Engineering, NIT Silchar, India.
Yadagiri Annepaka
Computer Science & Engineering, NIT Silchar, India.
Arun Kumar Challa
Electrical Engineering, NIT Silchar, India.
Samatha Reddy Machireddy
Computer Science & Engineering, NIT Silchar, India.
Partha Pakray
Associate Professor at National Institute of Technology Silchar, India
Textual Entailment, Question Answering, Answer Validation
Chukhu Chunka
Computer Science & Engineering, NIT Silchar, India.