🤖 AI Summary
This work addresses the measurability, malleability, and credibility of synthetic personality in large language models (LLMs). We propose a personality assessment and directed shaping framework with established reliability and validity. Methodologically, we design a prompt-driven evaluation protocol grounded in classical personality theory, particularly the Big Five, and integrate statistical validation with controllable text-generation techniques. We introduce standardized criteria for evaluating the reliability and validity of personality measurements in LLM outputs, and systematically investigate how model scale and instruction fine-tuning affect personality consistency. Results show that mainstream LLMs exhibit stable, reproducible personality expression under specific prompting configurations, enabling personality profiles to be measured and modulated along individual trait dimensions. This work lays a theoretical foundation and technical groundwork for the scientific development and ethical governance of personalized dialogue agents.
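The "directed shaping" idea can be sketched as graded persona instructions prepended to the generation prompt. The trait adjectives, intensifier vocabulary, and the `shaping_instruction` helper below are illustrative assumptions, not the paper's actual prompt template:

```python
# Hedged sketch of linguistic trait shaping: condition the model with a
# persona instruction whose adjective and intensifier encode the target
# Big Five trait level. The vocabulary here is assumed for illustration.
TRAIT_POLES = {
    "extraversion":      ("introverted", "extraverted"),
    "agreeableness":     ("disagreeable", "agreeable"),
    "conscientiousness": ("careless", "conscientious"),
    "neuroticism":       ("emotionally stable", "neurotic"),
    "openness":          ("closed to experience", "open to experience"),
}
INTENSIFIERS = {1: "a bit", 2: "very", 3: "extremely"}

def shaping_instruction(trait: str, level: int) -> str:
    """Map a target level in {-3..-1, 1..3} to a persona instruction:
    negative levels select the low pole, positive levels the high pole."""
    if level == 0 or abs(level) > 3:
        raise ValueError("level must be in -3..-1 or 1..3")
    low, high = TRAIT_POLES[trait]
    adjective = high if level > 0 else low
    return (f"For the following task, respond as a person "
            f"who is {INTENSIFIERS[abs(level)]} {adjective}.")
```

For example, `shaping_instruction("extraversion", 3)` asks for an "extremely extraverted" persona, while `shaping_instruction("extraversion", -1)` asks for "a bit introverted"; re-administering the assessment under each instruction then checks whether measured scores track the requested level.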
📝 Abstract
The advent of large language models (LLMs) has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text. As LLMs increasingly power conversational agents used by the general public worldwide, the synthetic personality embedded in these models, by virtue of training on large amounts of human data, is becoming increasingly important. Since personality is a key factor determining the effectiveness of communication, we present a comprehensive method for administering and validating personality tests on widely used LLMs, as well as for shaping personality in the generated text of such LLMs. Applying this method, we found: 1) personality measurements in the outputs of some LLMs under specific prompting configurations are reliable and valid; 2) evidence of reliability and validity of synthetic LLM personality is stronger for larger and instruction fine-tuned models; and 3) personality in LLM outputs can be shaped along desired dimensions to mimic specific human personality profiles. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
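A minimal sketch of what prompt-based test administration and a reliability check could look like. The prompt template and the `query_llm` stand-in are assumptions for illustration; only Cronbach's alpha, a standard internal-consistency statistic from psychometrics, is fixed by the method described:

```python
from statistics import pvariance

# Likert anchors mapped to numeric scores (wording is an assumption).
LIKERT = {"strongly disagree": 1, "disagree": 2,
          "neither agree nor disagree": 3,
          "agree": 4, "strongly agree": 5}

def administer_item(query_llm, item: str, persona: str = "") -> int:
    """Present one questionnaire item as a constrained rating prompt and
    map the model's answer back to a 1-5 score. `query_llm` is a stand-in
    for any text-completion call, not a specific API."""
    prompt = (f"{persona}\nRate the statement: \"{item}\".\n"
              f"Answer with exactly one of: {', '.join(LIKERT)}.")
    return LIKERT[query_llm(prompt).strip().lower()]

def cronbach_alpha(scores) -> float:
    """Internal-consistency reliability over repeated administrations:
    `scores` has one row per simulated respondent (e.g. one prompt
    variation) and one column per item of the same trait scale."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Rows whose items move together yield alpha near 1.0 (e.g. `cronbach_alpha([[1, 1], [2, 2], [3, 3]])` is exactly 1.0), while inconsistent answers across prompt variations drive alpha down, which is one way the reliability of a synthetic personality measurement can fail.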