A Comparative Approach to Assessing Linguistic Creativity of Large Language Models and Humans

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic, comparable evaluation frameworks for assessing differences between large language models (LLMs) and humans in linguistic creativity—specifically in word formation (derivation and compounding) and metaphor generation. Method: We propose a novel, general-purpose test framework comprising three original task types, evaluated automatically via the OCSAI toolkit along three dimensions: originality, elaboration, and flexibility; results are validated through human annotation. Contribution/Results: This work presents the first empirical, multidimensional comparison of LLMs and humans on linguistic creativity. Across eight subtasks, LLMs significantly outperform humans on six, and lead on all three metrics overall. Crucially, we identify fundamental cognitive divergences: humans favor expansive, context-sensitive creation, whereas LLMs excel at pattern-based, structurally stable generation. These findings challenge the prevailing assumption of irreplaceable human creativity and establish a new benchmark for modeling linguistic creativity.

📝 Abstract
This paper introduces a general linguistic creativity test for humans and Large Language Models (LLMs). The test consists of various tasks aimed at assessing their ability to generate new, original words and phrases based on word formation processes (derivation and compounding) and on metaphorical language use. We administered the test to 24 humans and to an equal number of LLMs, and we automatically evaluated their answers using the OCSAI tool against three criteria: Originality, Elaboration, and Flexibility. The results show that LLMs not only outperformed humans on all the assessed criteria, but also did better on six out of the eight test tasks. We then computed the uniqueness of the individual answers, which showed some minor differences between humans and LLMs. Finally, we performed a short manual analysis of the dataset, which revealed that humans are more inclined towards E(xtending)-creativity, while LLMs favor F(ixed)-creativity.
Problem

Research questions and friction points this paper is trying to address.

Assessing linguistic creativity in LLMs and humans
Comparing originality in word and phrase generation
Evaluating differences in creativity types between humans and LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

General linguistic creativity test for humans and LLMs
Automated evaluation using the OCSAI tool against three criteria
Comparative analysis of E-creativity and F-creativity tendencies
Anca Dinu
University of Bucharest
computational linguistics, natural language semantics, machine learning
Andra-Maria Florescu
School of Computing and Information, University of Bucharest, Bucharest, Romania
Alina Resceanu
School of Computing and Information, University of Craiova, Craiova, Romania