🤖 AI Summary
This study investigates structural and creative disparities between large language model (LLM)-generated and human-written texts across canonical paragraph types—introduction, body, and conclusion—to uncover AI-specific “creativity bottlenecks” and improve detection reliability.
Method: We propose a chessboard-inspired text segmentation strategy, integrating linguistic style analysis, paragraph-level comparative evaluation, and statistical significance testing.
Contribution/Results: Our analysis reveals, for the first time, a critical contradiction in LLM outputs: high surface plausibility coupled with high internal homogeneity—particularly pronounced in body paragraphs—yet easily masked by superficial coherence. In contrast, human texts exhibit 47% greater cross-paragraph stylistic divergence on average. Leveraging this structural asymmetry, we formulate a novel, interpretable detection criterion grounded in paragraph-level heterogeneity. Evaluated on benchmark datasets, it improves F1-score for AI-text detection by 12.3%, establishing a verifiable, structure-aware paradigm for reliable AI-text identification.
📝 Abstract
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) and Artificial Intelligence (AI), unlocking unprecedented capabilities. This rapid advancement has spurred research into various aspects of LLMs, their text generation&reasoning capability, and potential misuse, fueling the necessity for robust detection methods. While numerous prior research has focused on detecting LLM-generated text (AI text) and thus checkmating them, our study investigates a relatively unexplored territory: portraying the nuanced distinctions between human and AI texts across text segments. Whether LLMs struggle with or excel at incorporating linguistic ingenuity across different text segments carries substantial implications for determining their potential as effective creative assistants to humans. Through an analogy with the structure of chess games-comprising opening, middle, and end games-we analyze text segments (introduction, body, and conclusion) to determine where the most significant distinctions between human and AI texts exist. While AI texts can approximate the body segment better due to its increased length, a closer examination reveals a pronounced disparity, highlighting the importance of this segment in AI text detection. Additionally, human texts exhibit higher cross-segment differences compared to AI texts. Overall, our research can shed light on the intricacies of human-AI text distinctions, offering novel insights for text detection and understanding.