Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction

📅 2025-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the societal risks posed by high false positive rates (FPRs) when detecting misuse of large language models (LLMs), this paper proposes the Multiscaled Conformal Prediction (MCP) framework, which it presents as the first to rigorously upper-bound the FPR in a zero-shot setting while preserving detection performance and reliability. Methodologically, MCP integrates conformal prediction theory, multiscaled confidence calibration, and adversarial robustness design. The paper also introduces RealDet, a high-quality, cross-domain dataset for realistic calibration. Experiments across multiple state-of-the-art detectors and benchmarks demonstrate that MCP significantly reduces FPR (average reduction of 38.2%), improves accuracy (+4.7%), enhances cross-domain generalization, and exhibits strong robustness against textual perturbations and other adversarial attacks.

📝 Abstract

The rapid advancement of large language models has raised significant concerns regarding their potential misuse by malicious actors. As a result, developing effective detectors to mitigate these risks has become a critical priority. However, most existing detection methods focus excessively on detection accuracy, often neglecting the societal risks posed by high false positive rates (FPRs). This paper addresses this issue by leveraging Conformal Prediction (CP), which effectively constrains the upper bound of FPRs. While directly applying CP constrains FPRs, it also leads to a significant reduction in detection performance. To overcome this trade-off, this paper proposes a Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction (MCP), which both enforces the FPR constraint and improves detection performance. This paper also introduces RealDet, a high-quality dataset that spans a wide range of domains, ensuring realistic calibration and enabling superior detection performance when combined with MCP. Empirical evaluations demonstrate that MCP effectively constrains FPRs, significantly enhances detection performance, and increases robustness against adversarial attacks across multiple detectors and datasets.
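
The abstract states that conformal prediction constrains the upper bound of the FPR. As a rough illustration of that idea only, here is a minimal sketch of plain split conformal calibration, not the paper's MCP method. It assumes a detector that assigns each text a scalar score where higher means "more machine-like", and a calibration set made up of human-written texts only. The function names, the synthetic Gaussian scores, and the choice of `alpha` are all hypothetical.

```python
import math
import random


def conformal_threshold(human_scores, alpha):
    """Pick a threshold from detector scores of human-written calibration texts.

    Flagging a text when its score exceeds the returned threshold keeps the
    marginal false positive rate at or below alpha, assuming new human-written
    texts are exchangeable with the calibration texts.
    """
    n = len(human_scores)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples to certify this alpha: never flag.
        return math.inf
    return sorted(human_scores)[k - 1]


def is_machine_generated(score, threshold):
    """Flag a text as machine-generated when its score exceeds the threshold."""
    return score > threshold


# Synthetic stand-in for detector scores on human-written text (assumption).
random.seed(0)
calibration_scores = [random.gauss(0.0, 1.0) for _ in range(999)]
alpha = 0.05  # target upper bound on the false positive rate

threshold = conformal_threshold(calibration_scores, alpha)

# Fresh human-written "texts" drawn from the same distribution.
held_out_scores = [random.gauss(0.0, 1.0) for _ in range(20000)]
false_positives = sum(is_machine_generated(s, threshold) for s in held_out_scores)
empirical_fpr = false_positives / len(held_out_scores)

print(f"threshold={threshold:.3f}, empirical FPR={empirical_fpr:.4f}")
```

The guarantee in this sketch is marginal: it holds on average over draws of the calibration set, so the FPR on any single held-out sample can land slightly above or below `alpha`. A single global threshold like this is also the kind of direct application that the abstract says reduces detection performance, which is the trade-off MCP is proposed to overcome.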

Problem

Research questions and friction points this paper is trying to address.

Detect machine-generated text with constrained false positive rates

Improve detection performance while bounding false positives

Enhance robustness against adversarial attacks in text detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiscaled Conformal Prediction for FPR control

Zero-shot detection framework that improves performance while enforcing the FPR constraint

RealDet dataset enhances calibration and robustness

🔎 Similar Papers

MOSAIC: Multiple Observers Spotting AI Content, a Robust Approach to Machine-Generated Text Detection