🤖 AI Summary
Current paragraph-level machine-generated text detection methods suffer from performance limitations due to their neglect of embedded human-like writing fragments. This work presents the first theoretical analysis of how such concealed human-like segments affect detection reliability and introduces a model-agnostic, stacking-based enhancement framework. The framework employs a hard EM-inspired heuristic to iteratively identify and retain high-confidence human-like subsequences while simultaneously refining the detector. It operates in both trainable and training-free modes, consistently boosting the performance of existing detectors across diverse large language models and real-world scenarios. Notably, the approach offers the practical advantage of flexible deployment without requiring any additional training.
📝 Abstract
Machine-generated texts (MGTs) produced by large language models (LLMs) are increasingly prevalent across various applications, while their potential misuse in fake news propagation and phishing has raised serious concerns, highlighting the need for MGT detection. Existing paragraph-level detection methods commonly treat MGTs as entirely machine-like, overlooking the hidden human-like nature of machine-generated texts: even fully machine-generated texts may contain spans that are highly consistent with human writing. To this end, we first reveal the existence of such hidden human-like spans, and then theoretically analyze their impact on detection. Our analysis shows that these spans increase the sentence complexity for detection, thereby making MGT detection intrinsically harder. Based on this finding, we propose a model-agnostic stacked enhancement framework that improves existing detectors by reducing the influence of hidden human-like spans. Specifically, we model span-level retention decisions as a latent-variable problem and instantiate the optimization with a hard-EM-inspired procedure, where the detector iteratively filters confidently human-like subsequences and refines itself on the remaining text. Extensive experiments across various LLMs and practical scenarios demonstrate that the proposed framework consistently enhances existing detectors. Notably, the framework can also work in a training-free manner, offering flexibility and scalability for practical deployment.