Black-box Detection of LLM-generated Text Using Generalized Jensen-Shannon Divergence

📅 2025-10-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Detecting LLM-generated text in black-box settings—where the source model is unknown, surrogate models are mismatched, and contrastive generation is costly—remains challenging. Method: We propose SurpMark, a lightweight detection framework that avoids contrastive generation. It dynamically constructs a Markov state transition matrix based on token surprisal and employs the generalized Jensen–Shannon divergence to quantify distributional discrepancies between test texts and human/machine reference corpora. Theoretical analysis establishes the validity of its discretization criterion and the asymptotic normality of its test statistic. Results: Extensive experiments across multiple datasets, source LLMs, and diverse scenarios demonstrate that SurpMark consistently matches or surpasses state-of-the-art baselines. Ablation studies and statistical tests further confirm the efficacy of each component and validate theoretical convergence properties.

📝 Abstract
We study black-box detection of machine-generated text under practical constraints: the scoring model (proxy LM) may mismatch the unknown source model, and per-input contrastive generation is costly. We propose SurpMark, a reference-based detector that summarizes a passage by the dynamics of its token surprisals. SurpMark quantizes surprisals into interpretable states, estimates a state-transition matrix for the test text, and scores it via a generalized Jensen-Shannon (GJS) gap between the test transitions and two fixed references (human vs. machine) built once from historical corpora. We prove a principled discretization criterion and establish the asymptotic normality of the decision statistic. Empirically, across multiple datasets, source models, and scenarios, SurpMark consistently matches or surpasses baselines; our experiments corroborate the statistic's asymptotic normality, and ablations validate the effectiveness of the proposed discretization.
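The pipeline described in the abstract (quantize token surprisals into discrete states, estimate a state-transition matrix for the test text, then score it by a GJS gap against fixed human and machine references) can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the bin edges, Laplace smoothing, state count, and the row-averaged form of the GJS statistic are all simplifying assumptions.

```python
import numpy as np

def quantize_surprisals(surprisals, edges):
    """Map each token surprisal to a discrete state via fixed bin edges."""
    return np.digitize(surprisals, edges)  # states in 0..len(edges)

def transition_matrix(states, n_states):
    """Estimate a row-stochastic state-transition matrix (add-one smoothing)."""
    counts = np.ones((n_states, n_states))  # Laplace smoothing, assumed here
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def gjs(dists, weights):
    """Generalized Jensen-Shannon divergence: H(mixture) - weighted entropies."""
    mix = sum(w * d for w, d in zip(weights, dists))
    return entropy(mix) - sum(w * entropy(d) for w, d in zip(weights, dists))

def score(test_T, human_T, machine_T, w=(0.5, 0.5)):
    """Row-averaged GJS gap; higher means closer to the machine reference.

    The aggregation over transition rows is a placeholder choice, not
    necessarily the paper's decision statistic.
    """
    gjs_h = np.mean([gjs([test_T[i], human_T[i]], w) for i in range(len(test_T))])
    gjs_m = np.mean([gjs([test_T[i], machine_T[i]], w) for i in range(len(test_T))])
    return gjs_h - gjs_m
```

In this sketch the references `human_T` and `machine_T` would be built once from historical corpora, so scoring a new passage needs only one proxy-LM pass for its surprisals and no contrastive generation.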
Problem

Research questions and friction points this paper addresses.

Detect machine-generated text under model mismatch constraints
Reduce costly per-input contrastive generation requirements
Quantify surprisal dynamics via generalized Jensen-Shannon divergence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses token surprisal dynamics for text detection
Quantizes surprisals into interpretable state transitions
Employs generalized Jensen-Shannon divergence for scoring
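For reference, the generalized Jensen-Shannon divergence in its standard form (the paper's exact weighting scheme may differ) for distributions $P_1,\dots,P_n$ with mixture weights $\pi$ is

```latex
\mathrm{GJS}_{\pi}(P_1,\dots,P_n)
  = H\!\left(\sum_{i=1}^{n} \pi_i P_i\right) - \sum_{i=1}^{n} \pi_i H(P_i),
\qquad \pi_i \ge 0,\quad \sum_{i=1}^{n} \pi_i = 1,
```

where $H$ denotes Shannon entropy. For $n = 2$ with equal weights this reduces to the ordinary Jensen-Shannon divergence, which is why a two-reference (human vs. machine) comparison fits naturally into this framework.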