🤖 AI Summary
Detecting LLM-generated text in black-box settings, where the source model is unknown, surrogate models may be mismatched, and contrastive generation is costly, remains challenging. Method: We propose SurpMark, a lightweight detection framework that avoids contrastive generation. It dynamically constructs a Markov state-transition matrix from token surprisals and uses the generalized Jensen–Shannon divergence to quantify distributional discrepancies between the test text and human/machine reference corpora. Theoretical analysis establishes the validity of its discretization criterion and the asymptotic normality of its test statistic. Results: Extensive experiments across multiple datasets, source LLMs, and diverse scenarios show that SurpMark consistently matches or surpasses state-of-the-art baselines. Ablation studies and statistical tests further confirm the contribution of each component and validate the theoretical convergence properties.
📝 Abstract
We study black-box detection of machine-generated text under practical constraints: the scoring model (proxy LM) may mismatch the unknown source model, and per-input contrastive generation is costly. We propose SurpMark, a reference-based detector that summarizes a passage by the dynamics of its token surprisals. SurpMark quantizes surprisals into interpretable states, estimates a state-transition matrix for the test text, and scores it via a generalized Jensen-Shannon (GJS) gap between the test transitions and two fixed references (human vs. machine) built once from historical corpora. We prove a principled discretization criterion and establish the asymptotic normality of the decision statistic. Empirically, across multiple datasets, source models, and scenarios, SurpMark consistently matches or surpasses baselines; our experiments corroborate the statistic's asymptotic normality, and ablations validate the effectiveness of the proposed discretization.
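The pipeline the abstract describes, quantize surprisals into states, estimate a transition matrix, compare against two fixed references, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bin edges, the add-one smoothing, the flattening of transition matrices into joint distributions, and the plain (unweighted) Jensen–Shannon divergence are all simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

def quantize(surprisals, bin_edges):
    # Map each token surprisal to a discrete state via fixed bin edges
    # (bin edges are assumed given; the paper derives a principled criterion).
    return np.digitize(surprisals, bin_edges)

def transition_probs(states, n_states):
    # Estimate a row-stochastic state-transition matrix with add-one smoothing
    # so short passages still yield well-defined rows.
    counts = np.ones((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    # Plain Jensen-Shannon divergence between two flattened distributions;
    # the paper uses a generalized JS, simplified here for illustration.
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def surpmark_score(surprisals, bin_edges, ref_human, ref_machine):
    # Positive score: the test text's transition dynamics sit closer to the
    # machine reference than to the human reference. ref_human / ref_machine
    # are transition matrices built once from historical corpora.
    states = quantize(surprisals, bin_edges)
    t = transition_probs(states, len(bin_edges) + 1).ravel()
    return js_divergence(t, ref_human.ravel()) - js_divergence(t, ref_machine.ravel())
```

The key design point carried over from the abstract: the two references are built once offline, so scoring a new passage needs only a single pass of surprisal computation under the proxy LM, with no per-input contrastive generation.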