Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing metric-based approaches to machine-generated text detection are sensitive to the randomness inherent in the generation process, which biases their token-level detection scores. This work theoretically derives the multi-hop transitions of these scores and proposes a multi-level contextual token relation modeling framework to address the bias. The framework uses a lightweight Markov-informed calibration module to refine local token-level evidence, a rule-support reasoning module that applies explicit logical rules derived from contextual score statistics for global reasoning, and a joint inference step that combines the two signals. The reported experiments show broad and substantial improvements in cross-LLM and cross-domain settings, with low computational overhead.

📝 Abstract

Machine-generated texts (MGTs) pose risks such as disinformation and phishing, underscoring the need for reliable detection. Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting. Given their diverse designs, we first place representative metric-based methods within a unified framework, enabling a clear assessment of their advantages and limitations. Our analysis identifies a core challenge across these methods: the token-level detection score is easily biased by the inherent randomness of the MGT generation process. Then, we theoretically derive the multi-hop transitions of the token-level detection score and explore their local and global relations. Based on these findings, we propose a multi-level contextual token relation modeling framework for MGT detection. Specifically, for local relations, we model them through a lightweight Markov-informed calibration module that refines token-level evidence before aggregation. For global relations, we introduce a rule-support reasoning module that uses explicit logical rules derived from contextual score statistics. Finally, we combine the local calibrated score and the global rule-support reasoning signal in a joint multi-level inference framework. Extensive experiments show broad and substantial improvements across various real-world scenarios, including cross-LLM and cross-domain settings, with low computational overhead.
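The abstract gives no formulas, so the following is only a toy sketch of the three-stage shape it describes: local calibration of token-level scores, global rule support, and a joint score. Every name, rule, and constant here (`calibrate_local`, `rule_support`, `joint_score`, the neighbour-averaging scheme, the three rules, `threshold`, `alpha`) is an assumption made for illustration and is not the authors' method. Token scores are assumed to be per-token log-probabilities under some scoring model, with higher values treated as more machine-like.

```python
import math
from statistics import mean, pstdev


def calibrate_local(scores, weight=0.5):
    """Blend each token score with the mean of its immediate neighbours.

    A stand-in for local calibration (assumption): one noisy token score
    is pulled toward its context before aggregation.
    """
    calibrated = []
    for i, s in enumerate(scores):
        neighbours = scores[max(0, i - 1):i] + scores[i + 1:i + 2]
        context = mean(neighbours) if neighbours else s
        calibrated.append((1 - weight) * s + weight * context)
    return calibrated


def rule_support(scores, threshold):
    """Fraction of hand-written rules over score statistics that fire.

    The three rules are hypothetical examples of "explicit logical rules
    derived from contextual score statistics".
    """
    rules = [
        mean(scores) > threshold,                                # high average score
        pstdev(scores) < 1.0,                                    # unusually uniform scores
        sum(s > threshold for s in scores) / len(scores) > 0.5,  # most tokens score high
    ]
    return sum(rules) / len(rules)


def joint_score(scores, threshold=-2.0, alpha=0.5):
    """Combine the calibrated local score and the rule support into [0, 1]."""
    local = mean(calibrate_local(scores))
    local_prob = 1.0 / (1.0 + math.exp(-(local - threshold)))  # squash to (0, 1)
    return alpha * local_prob + (1 - alpha) * rule_support(scores, threshold)


# Hypothetical token log-probabilities for two short texts.
machine_like = [-1.0, -1.2, -0.8]
human_like = [-6.0, -0.5, -7.0, -1.0]
print(joint_score(machine_like), joint_score(human_like))
```

In this toy setup the uniformly high-scoring sequence receives the larger joint score. The sketch illustrates only the data flow; a real detector would need the paper's actual calibration and rules, plus thresholds fitted on data.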

Problem

Research questions and friction points this paper is trying to address.

machine-generated text detection

token-level detection

generation randomness

metric-based methods

detection bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-level contextual modeling

token relation

Markov-informed calibration