Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

📅 2025-02-17
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This study investigates human ability to distinguish and prefer AI-generated versus human-written text across multilingual and multidomain settings. We conduct a large-scale, manually annotated evaluation involving 16 datasets, 9 languages, and 9 thematic domains, establishing a multilingual detectability assessment framework. Our results reveal, for the first time, that cross-lingual human detection accuracy reaches 87.6%, substantially challenging the prevailing assumption of indistinguishability between AI- and human-authored texts. We identify textual specificity, cultural adaptability, and expressive diversity as key discriminative dimensions. Furthermore, we propose a cognition-informed controllable generation method that significantly enhances anthropomorphism in over 50% of cases. Crucially, we demonstrate that human preference does not inherently favor human-authored text: when authorship is unknown, participants exhibit a systematic preference for AI-generated content. These findings advance both theoretical understanding of human-AI text perception and practical design of linguistically grounded, cognitively aligned generative systems.

๐Ÿ“ Abstract
Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written text is highly challenging, and often no better than random guessing. To verify whether this finding generalizes across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, challenging previous conclusions. We find that the major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Explicitly explaining these distinctions in the prompt can partially bridge the gaps in over 50% of the cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.
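The abstract's central claim is that 87.6% detection accuracy is far above the random-guessing baseline of 50%. A minimal sketch of how such a claim can be checked: compute per-annotator accuracy and an exact one-sided binomial tail probability under the chance hypothesis. The label encoding (1 = human-written, 0 = AI-generated) and the counts used below are hypothetical illustrations, not the paper's actual data.

```python
from math import comb

def detection_accuracy(labels, guesses):
    # Fraction of items where the annotator's call matches the true source.
    correct = sum(1 for y, g in zip(labels, guesses) if y == g)
    return correct / len(labels)

def binomial_p_value(correct, total, chance=0.5):
    # Exact one-sided probability of observing >= `correct` hits
    # if the annotator were guessing at random with probability `chance`.
    return sum(comb(total, k) * chance**k * (1 - chance)**(total - k)
               for k in range(correct, total + 1))

# Hypothetical counts: 88 correct out of 100 judgments (close to the
# reported 87.6% average). The tail probability is vanishingly small,
# so performance at that level is incompatible with random guessing.
p = binomial_p_value(88, 100)
print(p)  # far below any conventional significance threshold
```

This is why the sample sizes in the study matter: at 50 judgments the same accuracy would already rule out chance, but with only a handful of items per annotator it could not.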
Problem

Research questions and friction points this paper is trying to address.

Human detection accuracy across languages
Gaps between human and machine text
Human preference under source ambiguity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual human detection study
Prompting bridges text gaps
Human preference varies by source
👥 Authors
Yuxia Wang
MBZUAI
Natural Language Processing
Rui Xing
University of Melbourne
Natural Language Processing, Artificial Intelligence, Deep Learning
Jonibek Mansurov
PhD student in NLP, MBZUAI
NLP
Giovanni Puccetti
ISTI-CNR
Zhuohan Xie
MBZUAI
Financial AI, Reasoning, Natural Language Processing, Computational Linguistics, Deep Learning
Minh Ngoc Ta
BKAI Research Center, Hanoi University of Science and Technology
Jiahui Geng
Mohamed bin Zayed University of Artificial Intelligence
Artificial Intelligence, Natural Language Processing
Jinyan Su
Cornell University
LLM reasoning, LLM agents, retrieval-augmented generation
Mervat Abassy
Alexandria University
Saad El Dine Ahmed
Alexandria University
Kareem Elozeiri
Zewail City of Science and Technology
Nurkhan Laiyk
MBZUAI
NLP
Maiya Goloburda
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Uncertainty Quantification, Trustworthy NLP, LLM Safety, Low-resource NLP
Tarek Mahmoud
MBZUAI
Raj Vardhan Tomar
Cluster Innovation Center, University of Delhi
Alexander Aziz
University of Florida
Ryuto Koike
Institute of Science Tokyo
Masahiro Kaneko
Mohamed bin Zayed University of Artificial Intelligence
Natural Language Processing
Artem Shelmanov
MBZUAI
uncertainty estimation, fairness, active learning, NLP, deep learning
Ekaterina Artemova
Toloka.AI, ex-HSE, ex-LMU
natural language processing, benchmarking, large language models
Vladislav Mikhailov
University of Oslo
LLMs, NLP, benchmarking
Akim Tsvigun
Senior ML Architect @ Nebius AI
Natural Language Processing, Machine Learning, Active Learning
Alham Fikri Aji
MBZUAI, Monash Indonesia
Multilinguality, Low-resource NLP, Language Modeling, Machine Translation
Nizar Habash
Professor of Computer Science, New York University Abu Dhabi
Natural Language Processing, Computational Linguistics, Artificial Intelligence
Iryna Gurevych
Full Professor, TU Darmstadt; Adjunct Professor, MBZUAI, UAE; Affiliated Professor, INSAIT, Bulgaria
Natural Language Processing, Large Language Models, Artificial Intelligence
Preslav Nakov
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Computational Linguistics, Large Language Models, Fact-checking, Fake News