Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction

📅 2025-05-08
🤖 AI Summary
To address the societal risks posed by high false positive rates (FPR) in large language model (LLM) misuse detection, this paper proposes the Multiscaled Conformal Prediction (MCP) framework, the first to rigorously upper-bound FPR in a zero-shot setting while preserving detection performance and reliability. Methodologically, MCP integrates conformal prediction theory, multi-scale confidence calibration, and adversarial robustness design, and introduces RealDet, a high-quality, cross-domain dataset for realistic calibration. Experiments across multiple state-of-the-art detectors and benchmarks demonstrate that MCP significantly reduces FPR (average reduction of 38.2%), improves accuracy (+4.7%), enhances cross-domain generalization, and exhibits strong robustness against textual perturbations and other adversarial attacks.

📝 Abstract
The rapid advancement of large language models has raised significant concerns regarding their potential misuse by malicious actors. As a result, developing effective detectors to mitigate these risks has become a critical priority. However, most existing detection methods focus excessively on detection accuracy, often neglecting the societal risks posed by high false positive rates (FPRs). This paper addresses this issue by leveraging Conformal Prediction (CP), which effectively constrains the upper bound of FPRs. While directly applying CP constrains FPRs, it also leads to a significant reduction in detection performance. To overcome this trade-off, this paper proposes a Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction (MCP), which both enforces the FPR constraint and improves detection performance. This paper also introduces RealDet, a high-quality dataset that spans a wide range of domains, ensuring realistic calibration and enabling superior detection performance when combined with MCP. Empirical evaluations demonstrate that MCP effectively constrains FPRs, significantly enhances detection performance, and increases robustness against adversarial attacks across multiple detectors and datasets.
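To make the idea of constraining the FPR upper bound concrete, the following is a minimal sketch of standard split conformal calibration. It is not the paper's MCP procedure. It assumes a detector that returns a scalar score where higher means "more machine-like", and a calibration set of scores from human-written text only; the function names `conformal_threshold` and `is_machine_generated` are hypothetical. If calibration and test scores are exchangeable, flagging text whose score exceeds the returned threshold keeps the marginal FPR at or below `alpha`.

```python
import math
import random


def conformal_threshold(human_scores, alpha):
    """Score threshold from human-written calibration text (hypothetical helper).

    Uses the ceil((n + 1) * (1 - alpha))-th smallest calibration score, the
    usual split conformal quantile. Assumes higher score = more machine-like.
    """
    n = len(human_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for this alpha: never flag anything.
        return math.inf
    return sorted(human_scores)[k - 1]


def is_machine_generated(score, threshold):
    # Flag only scores strictly above the calibrated threshold.
    return score > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for detector scores on human-written text.
    calibration = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    held_out = [rng.gauss(0.0, 1.0) for _ in range(5000)]

    tau = conformal_threshold(calibration, alpha=0.05)
    fpr = sum(is_machine_generated(s, tau) for s in held_out) / len(held_out)
    print(f"threshold={tau:.3f}, empirical FPR on held-out human text={fpr:.3f}")
```

The guarantee in this sketch is marginal (averaged over the draw of the calibration set) and holds only when the calibration texts resemble the human-written texts seen at test time, which is one reason a broad, realistic calibration set matters. A single global threshold like this one is also the naive application of CP that the abstract says can reduce detection performance.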
Problem

Research questions and friction points this paper is trying to address.

Detect machine-generated text with constrained false positive rates
Improve detection performance while bounding false positives
Enhance robustness against adversarial attacks in text detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiscaled Conformal Prediction for FPR control
Zero-Shot detection framework improves performance
RealDet dataset enhances calibration and robustness
Xiaowei Zhu
Ant Research
Graph Database, Big Data Systems, Privacy-Preserving Computation, AI Infra
Yubing Ren
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Yanan Cao
Institute of Information Engineering, Chinese Academy of Sciences
Xixun Lin
Institute of Information Engineering, Chinese Academy of Sciences
Data mining, Graph representation learning, Large language model
Fang Fang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Yangxi Li
National Computer Network Emergency Response Technical Team, Coordination Center of China, Beijing, China