Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

📅 2025-02-17
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This study investigates human ability to distinguish and prefer AI-generated versus human-written text across multilingual and multidomain settings. We conduct a large-scale, manually annotated evaluation involving 16 datasets, 9 languages, and 9 thematic domains, establishing a multilingual detectability assessment framework. Our results reveal, for the first time, that cross-lingual human detection accuracy reaches 87.6%, substantially challenging the prevailing assumption of indistinguishability between AI- and human-authored texts. We identify textual specificity, cultural adaptability, and expressive diversity as key discriminative dimensions. Furthermore, we propose a cognition-informed controllable generation method that significantly enhances anthropomorphism in over 50% of cases. Crucially, we demonstrate that human preference does not inherently favor human-authored text: when authorship is unknown, participants exhibit a systematic preference for AI-generated content. These findings advance both theoretical understanding of human-AI text perception and practical design of linguistically grounded, cognitively aligned generative systems.

๐Ÿ“ Abstract
Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written text is highly challenging, and often no better than random guessing. To verify whether this finding generalizes across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, challenging previous conclusions. We find that the major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Explicitly explaining these distinctions in the prompt can partially bridge the gaps in over 50% of the cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.
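The abstract's central claim is that 87.6% detection accuracy is far above the random-guessing baseline of 50%. A minimal sketch of how such a claim can be checked: compute per-annotator accuracy and an exact one-sided binomial tail probability under the chance hypothesis. The label encoding (1 = human-written, 0 = AI-generated) and the counts used below are hypothetical illustrations, not the paper's actual data.

```python
from math import comb

def detection_accuracy(labels, guesses):
    # Fraction of items where the annotator's call matches the true source.
    correct = sum(1 for y, g in zip(labels, guesses) if y == g)
    return correct / len(labels)

def binomial_p_value(correct, total, chance=0.5):
    # Exact one-sided probability of observing >= `correct` hits
    # if the annotator were guessing at random with probability `chance`.
    return sum(comb(total, k) * chance**k * (1 - chance)**(total - k)
               for k in range(correct, total + 1))

# Hypothetical counts: 88 correct out of 100 judgments (close to the
# reported 87.6% average). The tail probability is vanishingly small,
# so performance at that level is incompatible with random guessing.
p = binomial_p_value(88, 100)
print(p)  # far below any conventional significance threshold
```

This is why the sample sizes in the study matter: at 50 judgments the same accuracy would already rule out chance, but with only a handful of items per annotator it could not.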
Problem

Research questions and friction points this paper is trying to address.

Human detection accuracy across languages
Gaps between human and machine text
Human preference under source ambiguity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual human detection study
Prompting bridges text gaps
Human preference varies by source
👥 Authors
Yuxia Wang
MBZUAI
Natural Language Processing
Rui Xing
University of Melbourne
Natural Language Processing, Artificial Intelligence, Deep Learning
Jonibek Mansurov
PhD student in NLP, MBZUAI
NLP
Giovanni Puccetti
ISTI-CNR
Zhuohan Xie
MBZUAI
Financial AI, Reasoning, Natural Language Processing, Computational Linguistics, Deep Learning
Minh Ngoc Ta
BKAI Research Center, Hanoi University of Science and Technology
Jiahui Geng
Mohamed bin Zayed University of Artificial Intelligence
Artificial Intelligence, Natural Language Processing
Jinyan Su
Cornell University
LLM reasoning, LLM agents, retrieval-augmented generation
Mervat Abassy
Alexandria University
Saad El Dine Ahmed
Alexandria University
Kareem Elozeiri
Zewail City of Science and Technology
Nurkhan Laiyk
MBZUAI
NLP
Maiya Goloburda
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Uncertainty Quantification, Trustworthy NLP, LLM Safety, Low-resource NLP
Tarek Mahmoud
MBZUAI
Raj Vardhan Tomar
Cluster Innovation Center, University of Delhi
Alexander Aziz
University of Florida
Ryuto Koike
Institute of Science Tokyo
Masahiro Kaneko
Mohamed bin Zayed University of Artificial Intelligence
Natural Language Processing
Artem Shelmanov
MBZUAI
uncertainty estimation, fairness, active learning, NLP, deep learning
Ekaterina Artemova
Toloka.AI, ex-HSE, ex-LMU
natural language processing, benchmarking, large language models
Vladislav Mikhailov
University of Oslo
LLMs, NLP, benchmarking
Akim Tsvigun
Senior ML Architect @ Nebius AI
Natural Language Processing, Machine Learning, Active Learning
Alham Fikri Aji
MBZUAI, Monash Indonesia
Multilinguality, Low-resource NLP, Language Modeling, Machine Translation
Nizar Habash
Professor of Computer Science, New York University Abu Dhabi
Natural Language Processing, Computational Linguistics, Artificial Intelligence
Iryna Gurevych
Full Professor, TU Darmstadt; Adjunct Professor, MBZUAI, UAE; Affiliated Professor, INSAIT, Bulgaria
Natural Language Processing, Large Language Models, Artificial Intelligence
Preslav Nakov
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Computational Linguistics, Large Language Models, Fact-checking, Fake News