LM$^2$otifs : An Explainable Framework for Machine-Generated Texts Detection

📅 2025-05-18

📈 Citations: 0

✨ Influential: 0

career value

185K/year

🤖 AI Summary

To address the lack of interpretability in large language model (LLM) text detection, this paper proposes a multi-granularity interpretable framework. It constructs structured text representations via word co-occurrence graphs and integrates explainable graph neural networks (xGNNs) with probabilistic graphical models to enable hierarchical extraction and theoretically grounded attribution of semantic motifs—the first such approach. The method delivers fine-grained explanations spanning lexical to syntactic levels, uncovering distinctive linguistic fingerprints of machine-generated text. It achieves state-of-the-art detection performance across multiple benchmark datasets; empirical evaluation confirms that the extracted motifs effectively discriminate human- from machine-authored texts, while qualitative analysis demonstrates stability and reproducibility. Key contributions include: (i) the first integration of xGNNs with probabilistic graphical modeling; (ii) a motif-driven, multi-level attribution mechanism; and (iii) a verifiable, generalizable paradigm for discovering machine-language features.

Technology Category

Application Category

📝 Abstract

The impressive ability of large language models to generate natural text across various tasks has led to critical challenges in authorship authentication. Although numerous detection methods have been developed to differentiate between machine-generated texts (MGT) and human-generated texts (HGT), the explainability of these methods remains a significant gap. Traditional explainability techniques often fall short in capturing the complex word relationships that distinguish HGT from MGT. To address this limitation, we present LM$^2$otifs, a novel explainable framework for MGT detection. Inspired by probabilistic graphical models, we provide a theoretical rationale for the effectiveness. LM$^2$otifs utilizes eXplainable Graph Neural Networks to achieve both accurate detection and interpretability. The LM$^2$otifs pipeline operates in three key stages: first, it transforms text into graphs based on word co-occurrence to represent lexical dependencies; second, graph neural networks are used for prediction; and third, a post-hoc explainability method extracts interpretable motifs, offering multi-level explanations from individual words to sentence structures. Extensive experiments on multiple benchmark datasets demonstrate the comparable performance of LM$^2$otifs. The empirical evaluation of the extracted explainable motifs confirms their effectiveness in differentiating HGT and MGT. Furthermore, qualitative analysis reveals distinct and visible linguistic fingerprints characteristic of MGT.

Problem

Research questions and friction points this paper is trying to address.

Detecting machine-generated texts with explainable methods

Capturing complex word relationships in authorship authentication

Providing multi-level explanations for text classification decisions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses eXplainable Graph Neural Networks

Transforms text into word co-occurrence graphs

Extracts interpretable motifs for multi-level explanations

🔎 Similar Papers

Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated