🤖 AI Summary
This study investigates systematic biases in large language models (LLMs) when they simulate human susceptibility to misinformation, revealing their limited ability to replicate real-world belief formation and sharing behavior. By prompting LLMs with individual-level characteristics drawn from actual survey data, and by combining survey simulation with linear regression, causal inference, and training-data provenance analysis, the authors show that LLMs consistently overestimate the influence of attitudinal factors on belief and sharing while neglecting the role of social network structure. Although the models capture aggregate distributional trends, their over-reliance on attitudinal cues and omission of structural social determinants inflate explained variance and undermine behavioral realism. These findings highlight key limitations of current LLM-based simulation approaches for modeling complex socio-cognitive phenomena.
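As a rough illustration of the simulation setup described above (not the authors' actual pipeline), each real respondent's profile can be turned into a persona prompt and an LLM asked to answer the same misinformation items; the profile fields, example headline, response scales, and the query_llm helper below are all hypothetical.

```python
# Hypothetical sketch of LLM-based survey simulation: a respondent's survey
# profile becomes a persona prompt, and the model answers the same belief and
# sharing items the human answered. Field names, scales, the example headline,
# and query_llm() are illustrative assumptions, not the paper's code.

def build_persona_prompt(profile: dict) -> str:
    return (
        "You are answering a survey as the following person:\n"
        f"- Age: {profile['age']}, Education: {profile['education']}\n"
        f"- Political ideology (1=left, 7=right): {profile['ideology']}\n"
        f"- Trust in mainstream media (1-5): {profile['media_trust']}\n"
        f"- People they discuss news with in a typical week: {profile['network_size']}\n\n"
        'Headline: "Scientists admit common vaccine causes more harm than good."\n'
        "Q1. How accurate is this headline? (1=not at all, 4=very)\n"
        "Q2. How likely would you be to share it? (1=not at all, 4=very)\n"
        "Reply with two integers separated by a comma."
    )

def query_llm(prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    raise NotImplementedError

def simulate_respondent(profile: dict) -> tuple[int, int]:
    reply = query_llm(build_persona_prompt(profile))
    belief, share = (int(x.strip()) for x in reply.split(","))
    return belief, share
```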
📝 Abstract
Large language models (LLMs) are increasingly used as proxies for human judgment in computational social science, yet their ability to reproduce patterns of susceptibility to misinformation remains unclear. We test whether LLM-simulated survey respondents, prompted with participant profiles drawn from social survey data measuring network, demographic, attitudinal, and behavioral features, can reproduce human patterns of misinformation belief and sharing. Using three online surveys as baselines, we evaluate whether LLM outputs match observed response distributions and recover the feature-outcome associations present in the original survey data. LLM-generated responses capture broad distributional tendencies and show modest correlation with human responses, but consistently overstate the association between belief and sharing. Relative to models fit to human responses, linear models fit to simulated responses exhibit substantially higher explained variance and place disproportionate weight on attitudinal and behavioral features while largely ignoring personal network characteristics. Analyses of model-generated reasoning and LLM training data suggest that these distortions reflect systematic biases in how misinformation-related concepts are represented. Our findings suggest that LLM-based survey simulations are better suited to diagnosing systematic divergences from human judgment than to substituting for it.
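As a minimal sketch of the explained-variance comparison (assuming paired human and LLM-simulated outcomes per respondent; the file and column names below are placeholders, not the paper's data), one can fit the same linear specification to both response sets and compare R² and coefficient weights:

```python
# Minimal sketch of the comparison described above: fit the same linear model
# to human and to LLM-simulated belief ratings and compare explained variance
# and coefficients. The CSV file and column names are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("survey_with_simulated_responses.csv")  # hypothetical file

features = [
    "ideology", "media_trust", "conspiracy_mentality",  # attitudinal
    "news_sharing_freq",                                 # behavioral
    "network_size", "network_heterogeneity",             # personal network
]
X = df[features].to_numpy()

for outcome in ("belief_human", "belief_llm"):
    y = df[outcome].to_numpy()
    model = LinearRegression().fit(X, y)
    coefs = dict(zip(features, np.round(model.coef_, 3)))
    print(outcome, "R^2 =", round(model.score(X, y), 3), coefs)

# The divergence reported in the paper would appear as a much higher R^2 for
# belief_llm, with large attitudinal coefficients and near-zero network ones.
```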