Machine Text Detectors are Membership Inference Attacks

📅 2025-10-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Membership inference attacks (MIAs) and machine-generated text detection are closely connected in theory and method: both rely on a language model's output probability distribution, and the same metric is theoretically optimal for both. Method: To evaluate cross-task transferability, we propose MINT, a unified evaluation suite that implements 15 recent methods from both tasks and supports multi-generator, multi-domain, cross-task assessment. Contribution/Results: Experiments reveal strong rank correlation (Spearman ρ > 0.6) between a method's MIA performance and its detection performance. Notably, Binoculars, originally designed for machine text detection, also achieves state-of-the-art performance on MIA benchmarks. This work shows that methods transfer well between the two tasks, which have so far been studied by separate research communities, and it provides a shared basis for evaluating privacy auditing and synthetic content detection together.

📝 Abstract

Although membership inference attacks (MIAs) and machine-generated text detection target different goals (identifying training samples and synthetic texts, respectively), their methods often exploit similar signals based on a language model's probability distribution. Despite this shared methodological foundation, the two tasks have been studied independently, which may lead to conclusions that overlook stronger methods and valuable insights developed in the other task. In this work, we theoretically and empirically investigate the transferability between MIAs and machine text detection, i.e., how well a method originally developed for one task performs on the other. For our theoretical contribution, we prove that the metric that achieves the asymptotically highest performance on both tasks is the same. We unify a large proportion of the existing literature in the context of this optimal metric and hypothesize that the accuracy with which a given method approximates this metric is directly correlated with its transferability. Our large-scale empirical experiments, including 7 state-of-the-art MIA methods and 5 state-of-the-art machine text detectors across 13 domains and 10 generators, demonstrate very strong rank correlation (ρ > 0.6) in cross-task performance. Notably, we find that Binoculars, originally designed for machine text detection, achieves state-of-the-art performance on MIA benchmarks as well, demonstrating the practical impact of this transferability. Our findings highlight the need for greater cross-task awareness and collaboration between the two research communities. To facilitate cross-task developments and fair evaluations, we introduce MINT, a unified evaluation suite for MIAs and machine-generated text detection, with implementations of 15 recent methods from both tasks.
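To make the idea of a shared signal concrete, here is a minimal sketch. It is not the paper's implementation or the MINT API. It assumes per-token probabilities have already been obtained from some language model; the function names `mean_log_prob` and `auroc` and all numbers are hypothetical. The same likelihood-based score is commonly thresholded in both settings: a higher average log-probability is read as evidence of training membership in MIAs and as evidence of machine generation in detection.

```python
import math

def mean_log_prob(token_probs):
    """Average log-probability of the tokens in one text (higher = more likely under the model)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical per-token probabilities, for illustration only.
# "Positives" stand for training members (MIA) or machine-generated texts (detection).
positives = [[0.6, 0.5, 0.7], [0.4, 0.8, 0.5]]
# "Negatives" stand for non-members (MIA) or human-written texts (detection).
negatives = [[0.1, 0.3, 0.2], [0.5, 0.1, 0.2]]

pos_scores = [mean_log_prob(t) for t in positives]
neg_scores = [mean_log_prob(t) for t in negatives]
task_auroc = auroc(pos_scores, neg_scores)
print(f"AUROC on the toy data: {task_auroc:.2f}")
```

Only the labels change between the two tasks; the scoring function is identical, which is the sense in which a method built for one task can be evaluated on the other.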

Problem

Research questions and friction points this paper is trying to address.

Investigating transferability between membership inference attacks and text detection

Unifying optimal metrics for both tasks through theoretical analysis

Demonstrating cross-task performance correlation via large-scale empirical experiments
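The cross-task correlation mentioned above can be illustrated with a small rank-correlation sketch. It assumes each method has one performance number per task (for example, an AUROC); the method count and all values below are made up for illustration and are not results from the paper. The helpers `ranks` and `spearman` are hypothetical names, written in plain Python to keep the example self-contained.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores for five unnamed methods on each task (illustration only).
mia_scores = [0.55, 0.62, 0.58, 0.70, 0.66]
detection_scores = [0.71, 0.80, 0.85, 0.93, 0.90]

rho = spearman(mia_scores, detection_scores)
print(f"rank correlation on the toy data: {rho:.2f}")
```

A value of ρ close to 1 means that methods ranked highly on one task also tend to rank highly on the other, which is what transferability means in this context.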

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies membership inference and text detection methods

Proves that the same metric achieves the asymptotically highest performance on both tasks

Introduces MINT evaluation suite for cross-task assessment
