A Comparative Approach to Assessing Linguistic Creativity of Large Language Models and Humans

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic, comparable evaluation frameworks for assessing differences between large language models (LLMs) and humans in linguistic creativity—specifically in word formation (derivation and compounding) and metaphor generation. Method: We propose a novel, general-purpose test framework comprising three original task types, evaluated automatically via the OCSAI toolkit along three dimensions: originality, elaboration, and flexibility; results are validated through human annotation. Contribution/Results: This work presents the first empirical, multidimensional comparison of LLMs and humans on linguistic creativity. Across eight subtasks, LLMs significantly outperform humans on six, and lead on all three metrics overall. Crucially, we identify fundamental cognitive divergences: humans favor expansive, context-sensitive creation, whereas LLMs excel at pattern-based, structurally stable generation. These findings challenge the prevailing assumption of irreplaceable human creativity and establish a new benchmark for modeling linguistic creativity.

📝 Abstract
This paper introduces a general linguistic creativity test for humans and Large Language Models (LLMs). The test consists of various tasks aimed at assessing their ability to generate new, original words and phrases based on word formation processes (derivation and compounding) and on metaphorical language use. We administered the test to 24 humans and to an equal number of LLMs, and we automatically evaluated their answers using the OCSAI tool against three criteria: Originality, Elaboration, and Flexibility. The results show that LLMs not only outperformed humans on all the assessed criteria, but also did better on six out of the eight test tasks. We then computed the uniqueness of the individual answers, which showed some minor differences between humans and LLMs. Finally, we performed a short manual analysis of the dataset, which revealed that humans are more inclined towards E(xtending)-creativity, while LLMs favor F(ixed)-creativity.
Problem

Research questions and friction points this paper is trying to address.

Assessing linguistic creativity in LLMs and humans
Comparing originality in word and phrase generation
Evaluating differences in creativity types between humans and LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

General linguistic creativity test for humans and LLMs
Automated evaluation using the OCSAI tool against three criteria
Comparative analysis of E-creativity and F-creativity tendencies
Anca Dinu
University of Bucharest
computational linguistics, natural language semantics, machine learning
Andra-Maria Florescu
School of Computing and Information, University of Bucharest, Bucharest, Romania
Alina Resceanu
School of Computing and Information, University of Craiova, Craiova, Romania