Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard

📅 2024-02-22
🏛️ arXiv.org
📈 Citations: 10
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) exhibit identifiable, stable, and individualized linguistic styles that enable automatic model attribution. Focusing on GPT-3.5, GPT-4, and Bard, the authors propose a lightweight supervised classification framework, employing SVM and Random Forest classifiers, built on multidimensional linguistic features: part-of-speech distributions, dependency structures, sentiment polarity, and lexical statistics. The study demonstrates that LLMs possess robust linguistic "fingerprints" that persist across tasks and prompts. Experiments achieve an average attribution accuracy of 88% across diverse prompts and tasks. The work shows that distinct LLMs exhibit consistent, fine-grained stylistic differences and establishes an interpretable, high-accuracy, low-overhead approach to LLM provenance identification, providing technical support for model regulation, content provenance tracing, and accountability.

📝 Abstract
Large Language Models (LLMs) are capable of generating text that is similar to or surpasses human quality. However, it is unclear whether LLMs tend to exhibit distinctive linguistic styles akin to how human authors do. Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response to diverse inputs. The results point to significant linguistic variations which, in turn, enable us to attribute a given text to its LLM origin with a favorable 88% accuracy using a simple off-the-shelf classification model. Theoretical and practical implications of this intriguing finding are discussed.
Problem

Research questions and friction points this paper is trying to address.

Identifying distinctive linguistic styles of different LLMs
Comparing vocabulary, POS, dependency and sentiment across models
Attributing generated text to specific LLM sources accurately
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linguistic analysis of vocabulary, POS, dependencies, sentiment
Comparison across GPT-3.5, GPT-4, and Bard outputs
Off-the-shelf classifier achieves 88% attribution accuracy
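The attribution pipeline sketched above can be illustrated with a minimal example: extract simple stylometric features from each text, then train an off-the-shelf classifier (here scikit-learn's Random Forest, one of the two classifiers the paper names) to predict the source model. The features below (function-word rates, mean word length, type-token ratio) are illustrative stand-ins for the paper's fuller POS, dependency, sentiment, and lexical feature set, and the toy corpus and model labels are hypothetical.

```python
# Minimal sketch of LLM attribution via stylometric features + Random Forest.
# Feature set and training data are illustrative, not the paper's actual setup.
import re
from sklearn.ensemble import RandomForestClassifier

FUNCTION_WORDS = ["the", "of", "and", "to", "is", "that", "it", "as"]

def featurize(text: str) -> list:
    """Map a text to a small stylometric feature vector."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    n = max(len(words), 1)
    avg_len = sum(len(w) for w in words) / n        # mean word length
    ttr = len(set(words)) / n                       # type-token ratio (lexical diversity)
    fw_rates = [words.count(fw) / n for fw in FUNCTION_WORDS]  # function-word rates
    return [avg_len, ttr] + fw_rates

# Toy labeled corpus: texts paired with the (hypothetical) model that wrote them.
corpus = [
    ("The results indicate that the proposed method is effective.", "model_a"),
    ("It is worth noting that the analysis of the data is thorough.", "model_a"),
    ("Sure! Here's a quick rundown: grab the data, crunch it, done.", "model_b"),
    ("Okay, short version: load the stuff, run it, check the output.", "model_b"),
]
X = [featurize(text) for text, _ in corpus]
y = [label for _, label in corpus]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
pred = clf.predict([featurize("The evaluation shows that the approach is sound.")])[0]
print(pred)
```

In the paper's setting, the feature vectors would instead encode POS and dependency distributions plus sentiment scores, and the classifier would be trained on outputs collected from GPT-3.5, GPT-4, and Bard; the 88% figure refers to that full setup, not this sketch.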
Ariel Rosenfeld
Associate Professor at Bar-Ilan University
Artificial Intelligence · Human-Agent Interaction · AI for Social Good · Scientometrics
T. Lazebnik
Department of Mathematics, Ariel University, Ariel, Israel; Department of Cancer Biology, Cancer Institute, University College London, London, UK