LLMs generate structurally realistic social networks but overestimate political homophily

📅 2024-08-29
🏛️ arXiv.org
📈 Citations: 4
Influential: 1
🤖 AI Summary
Problem: Prior work lacks quantitative evaluation of the topological plausibility and latent biases (particularly political homophily) in large language model (LLM)-generated social networks.
Method: We propose a localized, role-by-role edge-generation prompting strategy (contrasted with global prompting), combined with zero-shot inference (using LLaMA and GPT series models) and rigorous social network metrics, including density, clustering coefficient, degree distribution, and homophily measures, to systematically assess structural realism and bias.
Contribution/Results: Localized prompting significantly improves topological fidelity, bringing generated networks close to empirical benchmarks across multiple structural metrics. However, all models exhibit systematic overestimation of political homophily, by a factor of 2–5 relative to ground truth, exceeding biases observed along age, gender, or education dimensions. To our knowledge, this is the first empirical study to jointly examine structural realism and political bias in LLM-generated social networks.

📝 Abstract
Generating social networks is essential for many applications, such as epidemic modeling and social simulations. The emergence of generative AI, especially large language models (LLMs), offers new possibilities for social network generation: LLMs can generate networks without additional training or the need to define network parameters, and users can flexibly define individuals in the network using natural language. However, this potential raises two critical questions: 1) are the social networks generated by LLMs realistic, and 2) what are the risks of bias, given the importance of demographics in forming social ties? To answer these questions, we develop three prompting methods for network generation and compare the generated networks to a suite of real social networks. We find that more realistic networks are generated with "local" methods, where the LLM constructs relations for one persona at a time, compared to "global" methods that construct the entire network at once. We also find that the generated networks match real networks on many characteristics, including density, clustering, connectivity, and degree distribution. However, we find that LLMs emphasize political homophily over all other types of homophily and significantly overestimate political homophily compared to real social networks.
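The difference between "global" and "local" generation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the persona fields (`gender`, `age`, `party`), the reply format, and the `ask` callable (a stand-in for any LLM call) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

Persona = Dict[str, str]
Edge = Tuple[int, int]


def roster(personas: List[Persona]) -> str:
    """Render all personas as numbered natural-language lines (hypothetical format)."""
    return "\n".join(
        f"{i}. {p['gender']}, age {p['age']}, {p['party']}"
        for i, p in enumerate(personas)
    )


def parse_edges(reply: str, n: int) -> Set[Edge]:
    """Parse lines like '0, 3' into undirected edges, skipping malformed lines."""
    edges: Set[Edge] = set()
    for line in reply.splitlines():
        parts = line.replace(",", " ").split()
        if len(parts) != 2 or not all(t.isdecimal() for t in parts):
            continue
        u, v = int(parts[0]), int(parts[1])
        if u != v and u < n and v < n:
            edges.add((min(u, v), max(u, v)))
    return edges


def generate_global(personas: List[Persona], ask: Callable[[str], str]) -> Set[Edge]:
    """Global method: a single prompt asks for the entire edge list at once."""
    prompt = (
        f"People:\n{roster(personas)}\n"
        "List all friendships, one 'id, id' pair per line."
    )
    return parse_edges(ask(prompt), len(personas))


def generate_local(personas: List[Persona], ask: Callable[[str], str]) -> Set[Edge]:
    """Local method: one prompt per persona, asking only for that persona's ties."""
    n = len(personas)
    edges: Set[Edge] = set()
    for i in range(n):
        prompt = (
            f"People:\n{roster(personas)}\n"
            f"You are person {i}. List the ids of your friends, one per line."
        )
        for token in ask(prompt).replace(",", " ").split():
            if token.isdecimal() and int(token) != i and int(token) < n:
                j = int(token)
                edges.add((min(i, j), max(i, j)))
    return edges
```

Passing the model as a plain callable keeps the sketch testable with a stub and independent of any particular LLM client. Taking the union of per-persona replies as undirected edges is one possible design choice; other ways of merging the local replies are equally plausible.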
Problem

Research questions and friction points this paper is trying to address.

Assessing realism of LLM-generated social networks
Evaluating bias risks in demographic-based social ties
Comparing political homophily in LLM vs real networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs generate networks without training or parameters
Local prompting methods create more realistic networks
Networks match real ones in density, clustering, connectivity
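The structural and homophily comparisons mentioned above can be sketched in plain Python. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, and the homophily measure shown (observed share of cross-group edges divided by the share expected if edges were placed uniformly at random over node pairs) is one common choice that may differ from the measure the authors use.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def build_adjacency(n: int, edges: Set[Edge]) -> Dict[int, Set[int]]:
    """Undirected adjacency sets for nodes 0..n-1."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def density(n: int, edges: Set[Edge]) -> float:
    """Fraction of possible undirected edges that are present."""
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


def average_clustering(adj: Dict[int, Set[int]]) -> float:
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def cross_group_ratio(edges: Set[Edge], group: List[str]) -> float:
    """Observed / expected share of cross-group edges.

    Values below 1 indicate homophily (fewer cross-group ties than chance).
    """
    n = len(group)
    cross_pairs = sum(1 for u, v in combinations(range(n), 2) if group[u] != group[v])
    if not edges or cross_pairs == 0:
        raise ValueError("need at least one edge and two distinct groups")
    observed = sum(1 for u, v in edges if group[u] != group[v]) / len(edges)
    expected = cross_pairs / (n * (n - 1) / 2)
    return observed / expected
```

Computing the same quantities on a generated network and on a real one, then comparing them, is the kind of check the summary describes. Under this measure, a generated network whose political cross-group ratio is much lower than the real network's would indicate overestimated political homophily.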
Serina Chang
Department of Computer Science, Stanford University
Alicja Chaszczewicz
Department of Computer Science, Stanford University
Emma Wang
Department of Computer Science, Stanford University
Maya Josifovska
Department of Computer Science, University of California, Los Angeles
Emma Pierson
University of California, Berkeley
Machine learning, Statistics, Data science, Healthcare, Inequality
J. Leskovec
Department of Computer Science, Stanford University