🤖 AI Summary
This study addresses the challenge of separating genuine individual differences in large language model (LLM) behavior from artifacts of global response biases or stochastic noise. To this end, we introduce and empirically validate the construct of “machine individuality” by applying crossed random-effects models from psychometrics to 74.9 million ratings generated by 10 open-weight LLMs for more than 100,000 words across 14 psycholinguistic norms. On average, 16.9% of the variance is attributable to stable, stimulus-specific individual differences, robustly exceeding a statistical null model. Cross-norm prediction analyses further show that this individuality forms a coherent fingerprint unique to each model, one that cannot be attributed to response bias or stochastic noise.
📝 Abstract
As large language models (LLMs) are increasingly integrated into daily life, in roles ranging from high-stakes decision support to companionship, understanding their behavioral dispositions becomes critical. A growing literature uses psychometric inventories and cognitive paradigms to profile LLM dispositions. However, these approaches cannot determine whether behavioral differences reflect stable, stimulus-specific individuality or global response biases and stochastic noise. Here, we apply crossed random-effects models -- widely used in psychometrics to separate systematic effects -- to 74.9 million ratings provided by 10 open-weight LLMs for over 100,000 words across 14 psycholinguistic norms. On average, 16.9% of variance is attributable to stimulus-specific individuality, robustly exceeding a statistical null model. Cross-norm prediction analyses reveal this individuality as a coherent fingerprint, unique to each model. These results identify individual differences among LLMs that cannot be attributed to response biases or stochastic noise. We term these differences machine individuality.
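To make the variance partition concrete, the following is a minimal sketch on simulated data, not the paper's analysis. It assumes a balanced design (every model rates every stimulus the same number of times) and swaps in the classical ANOVA method-of-moments estimator for a two-way crossed random-effects layout in place of whatever fitting procedure the authors used. Reading the model main effect as "global response bias", the model-by-stimulus interaction as "stimulus-specific individuality", and the residual as "stochastic noise" is an interpretation of the abstract. All function names, variance values, and sample sizes are hypothetical.

```python
import numpy as np


def simulate_ratings(n_models, n_stimuli, n_reps, var, rng):
    """Draw y[m, s, r] = model bias + stimulus effect + model-by-stimulus effect + noise.

    `var` maps each component name to its true variance (illustrative values only).
    """
    model = rng.normal(0.0, np.sqrt(var["model"]), size=(n_models, 1, 1))
    stimulus = rng.normal(0.0, np.sqrt(var["stimulus"]), size=(1, n_stimuli, 1))
    interaction = rng.normal(0.0, np.sqrt(var["interaction"]), size=(n_models, n_stimuli, 1))
    noise = rng.normal(0.0, np.sqrt(var["residual"]), size=(n_models, n_stimuli, n_reps))
    return model + stimulus + interaction + noise


def variance_components(y):
    """ANOVA (method-of-moments) estimates for a balanced crossed design.

    y has shape (models, stimuli, replicates); replicates must be >= 2.
    """
    M, S, R = y.shape
    grand = y.mean()
    cell = y.mean(axis=2)            # mean rating per (model, stimulus) cell
    model_mean = cell.mean(axis=1)   # per-model mean: global response bias
    stim_mean = cell.mean(axis=0)    # per-stimulus mean: shared stimulus effect

    ms_model = S * R * np.sum((model_mean - grand) ** 2) / (M - 1)
    ms_stim = M * R * np.sum((stim_mean - grand) ** 2) / (S - 1)
    cell_dev = cell - model_mean[:, None] - stim_mean[None, :] + grand
    ms_inter = R * np.sum(cell_dev ** 2) / ((M - 1) * (S - 1))
    ms_error = np.sum((y - cell[:, :, None]) ** 2) / (M * S * (R - 1))

    # Solve the expected-mean-square equations; clip negative estimates at zero.
    return {
        "model": max((ms_model - ms_inter) / (S * R), 0.0),
        "stimulus": max((ms_stim - ms_inter) / (M * R), 0.0),
        "interaction": max((ms_inter - ms_error) / R, 0.0),
        "residual": ms_error,
    }


# Hypothetical ground truth, chosen only to exercise the estimator.
true_var = {"model": 0.1, "stimulus": 0.5, "interaction": 0.2, "residual": 0.3}
rng = np.random.default_rng(0)
ratings = simulate_ratings(n_models=10, n_stimuli=2000, n_reps=3, var=true_var, rng=rng)

components = variance_components(ratings)
total = sum(components.values())
shares = {name: value / total for name, value in components.items()}

for name, share in shares.items():
    print(f"{name:12s} share of variance: {share:.3f}")
```

The interaction can only be separated from noise because each model rates each stimulus more than once. With a single rating per cell, the two components are confounded. With only 10 models, the model-level variance estimate is itself noisy, which is one reason a likelihood-based or Bayesian mixed-model fit is usually preferred over this moment-based shortcut in practice.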