MGT-Prism: Enhancing Domain Generalization for Machine-Generated Text Detection via Spectral Alignment

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing machine-generated text (MGT) detectors degrade significantly in cross-domain settings because of domain shift. This work models the representational discrepancy between human-written and machine-generated texts from a frequency-domain perspective, which the authors describe as a first, and finds spectral patterns that remain stable across domains. Based on this insight, it proposes a general-purpose frequency-domain detection framework with two components: (1) a low-frequency filtering module that suppresses domain-sensitive features, and (2) a dynamic spectral alignment strategy that extracts domain-invariant, task-critical features. The method requires no target-domain labels and is plug-and-play with mainstream text encoders. Experiments across three cross-domain settings and eleven benchmark datasets show that it outperforms state-of-the-art approaches by an average of 0.90% in accuracy and 0.92% in F1 score. These results support the generalization benefit of frequency-domain modeling for MGT detection.

📝 Abstract
Large Language Models have shown a growing ability to generate fluent and coherent text that closely resembles human writing. Current detectors for Machine-Generated Text (MGT) perform well when they are trained and tested in the same domain but generalize poorly to unseen domains, due to domain shift between data from different sources. In this work, we propose MGT-Prism, an MGT detection method that works in the frequency domain to achieve better domain generalization. Our key insight stems from analyzing text representations in the frequency domain, where we observe consistent spectral patterns across diverse domains, while significant discrepancies in magnitude emerge between MGT and human-written texts (HWTs). This observation motivates the design of a low-frequency domain filtering module that filters out document-level features sensitive to domain shift, and a dynamic spectrum alignment strategy that extracts task-specific, domain-invariant features to improve the detector's domain generalization. Extensive experiments demonstrate that MGT-Prism outperforms state-of-the-art baselines by an average of 0.90% in accuracy and 0.92% in F1 score on 11 test datasets across three domain-generalization scenarios.
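The abstract does not specify how the spectrum is computed or how the filter is parameterized, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a single hidden dimension of a text encoder is treated as a signal over token positions, that "document-level features" correspond to the lowest frequency bins, and that filtering means zeroing those bins. The function names and the `cutoff` parameter are hypothetical. A naive DFT from the standard library keeps the example dependency-free; an FFT library would normally be used instead.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns real parts, assuming the original signal was real."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def magnitude_spectrum(signal):
    """Magnitude of each frequency bin along the token axis."""
    return [abs(c) for c in dft(signal)]


def remove_low_frequencies(signal, cutoff):
    """Zero every bin whose frequency index is below `cutoff` (hypothetical parameter)."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # For a real signal, bins k and n - k carry the same frequency.
        if min(k, n - k) < cutoff:
            spectrum[k] = 0
    return idft(spectrum)


# Toy signal over 8 token positions: a constant offset (slow, document-level)
# plus a token-to-token alternation (fast, local).
values = [2.0 + (-1) ** t for t in range(8)]
spectrum = magnitude_spectrum(values)
filtered = remove_low_frequencies(values, cutoff=1)
# `filtered` keeps only the alternation: approximately [1, -1, 1, -1, ...]
```

In this toy case the constant offset lives entirely in bin 0, so a cutoff of 1 removes it and leaves the alternating component untouched. Whether MGT-Prism uses a hard cutoff like this or a learned filter is not stated in the abstract.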
Problem

Research questions and friction points this paper is trying to address.

Detecting machine-generated text across unseen domains
Addressing domain shift in text detection models
Improving generalization via spectral feature alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral alignment for domain generalization
Low frequency filtering to remove domain-sensitive features
Dynamic spectrum alignment for domain-invariant detection
Authors
Shengchao Liu
Faculty of Electronic and Information Engineering, Xi’an Jiaotong University
Xiaoming Liu
Faculty of Electronic and Information Engineering, Xi’an Jiaotong University
Chengzhengxu Li
Xi’an Jiaotong University
LLM, RL, Prompting
Zhaohan Zhang
Queen Mary University of London
Artificial Intelligence
Guoxin Ma
Faculty of Electronic and Information Engineering, Xi’an Jiaotong University
Yu Lan
Faculty of Electronic and Information Engineering, Xi’an Jiaotong University
Shuai Xiao
Alibaba Group
Machine Learning, Artificial Intelligence, Information Retrieval, Multimodal Models