🤖 AI Summary
This study investigates whether large language models (LLMs) implicitly encode, and can reproduce, cross-cultural personality differences. Method: Using zero-shot and few-shot prompting with the multilingual Big Five Inventory-2 (BFI-2), we elicited personality trait scores from GPT-4, LLaMA, and other LLMs for five cultural groups, including the U.S. and South Korea, and assessed reproducibility with the intraclass correlation coefficient (ICC) and statistical modeling. Results/Contribution: We provide the first systematic evidence that LLMs significantly replicate nation-level personality differences, particularly in Agreeableness and Conscientiousness, with high fidelity (r = 0.72–0.89), closely aligning with established human cross-cultural findings. Crucially, our analysis demonstrates that LLM training data contain measurable, transferable cultural biases in personality structure, establishing a novel paradigm and an empirical benchmark for cultural alignment and bias auditing in LLMs.
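As a rough illustration of the reproducibility check mentioned above, the following sketch computes an intraclass correlation over repeated questionnaire runs. The summary does not say which ICC form the authors used, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is an assumption here. The function name `icc2_1` and the score matrix are hypothetical and do not come from the study.

```python
import numpy as np


def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: 2-D array, rows = rated targets (e.g. BFI-2 items or trait scales),
            columns = repeated runs of the same prompt (treated as "raters").
    Assumed form; the source summary does not specify the ICC variant.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n targets, k repeated runs

    grand = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-target mean square
    msc = ss_cols / (k - 1)              # between-run mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical data: five trait scores (rows) from three repeated runs (columns).
runs = np.array([
    [3.8, 3.9, 3.7],
    [4.2, 4.1, 4.3],
    [2.9, 3.0, 2.8],
    [3.5, 3.4, 3.6],
    [2.4, 2.6, 2.5],
])
print(f"ICC(2,1) = {icc2_1(runs):.3f}")
```

Values near 1 would indicate that repeated runs of the same prompt yield nearly identical trait profiles. Values near 0 would suggest that run-to-run noise dominates the differences between traits.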