🤖 AI Summary
This study investigates the reproducibility and generalizability of large language model (LLM) simulations in empirical marketing research, focusing on AI-assisted replication of message effects. Method: LLM-powered “AI-personified participants” were used to replicate 133 experimental findings from 14 top-tier journal articles. For each study, prompts were generated from the original measures, stimuli, and sampling specifications; automated experimental designs were run; and the resulting main and interaction effects were tested for statistical consistency with the original human results. Contribution/Results: The LLM replications reproduced 76% (84/111) of main effects and 68% (90/133) of all reported effects, drawing on a dataset of 19,447 AI participants, and support rapid message testing and iterative theory refinement. To our knowledge, this is the first work to deploy LLMs as controllable, scalable behavioral agents for reproducibility studies in the social sciences, demonstrating both their promise and their limitations for causal inference and theoretical generalization.
📝 Abstract
This report analyzes the potential for large language models (LLMs) to expedite accurate replication and generalization of published research on message effects in marketing. LLM-powered participants (personas) were tested by replicating 133 experimental findings from 14 papers, containing 45 recent studies, published in the Journal of Marketing. For each study, the measures, stimuli, and sampling specifications were used to generate prompts for LLMs to act as unique personas. The AI personas, 19,447 in total across all studies, generated complete datasets, and the resulting statistical analyses were then compared with the original human study results. The LLM replications successfully reproduced 76% of the original main effects (84 out of 111), demonstrating strong potential for AI-assisted replication. The overall replication rate including interaction effects was 68% (90 out of 133). Furthermore, a test of how human results generalized to different participant samples, media stimuli, and measures showed that replication results can change when tests go beyond the parameters of the original human studies. Implications are discussed for the replication and generalizability crises in social science, the acceleration of theory building in media and marketing psychology, and the practical advantages of rapid message testing for consumer products. Limitations of AI replications are addressed with respect to complex interaction effects, biases in AI models, and establishing benchmarks for AI metrics in marketing research.
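The pipeline described above, generating per-persona prompts from a study's measures, stimuli, and sampling specifications, then scoring how many effects replicate, can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual code: the `StudySpec` structure, the prompt template, and the profile fields are hypothetical stand-ins, since the paper's real prompts and schemas are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class StudySpec:
    """Hypothetical container for one study's replication inputs."""
    measures: list       # dependent-variable scales, e.g. "purchase intent (1-7)"
    stimulus: str        # the message/ad shown to participants
    sample_profile: dict # persona attributes from the sampling specification

def persona_prompt(spec: StudySpec, persona_id: int) -> str:
    """Render one AI-persona prompt from the study specification
    (illustrative template only)."""
    traits = ", ".join(f"{k}={v}" for k, v in spec.sample_profile.items())
    scales = "; ".join(spec.measures)
    return (
        f"You are participant #{persona_id} with profile: {traits}.\n"
        f"Stimulus: {spec.stimulus}\n"
        f"Respond to the following measures: {scales}"
    )

def replication_rate(outcomes: list) -> float:
    """Share of original effects whose direction and significance
    are matched by the LLM-generated data."""
    return sum(outcomes) / len(outcomes)

spec = StudySpec(
    measures=["purchase intent (1-7)", "brand attitude (1-7)"],
    stimulus="Ad variant A: emotional appeal",
    sample_profile={"age": 34, "gender": "female"},
)
print(persona_prompt(spec, 1))

# The paper reports 84 of 111 main effects replicated:
print(round(replication_rate([True] * 84 + [False] * 27), 2))  # 0.76
```

In the actual study this prompt-generation step was repeated to produce 19,447 personas across 45 studies, and the consistency check covered interaction effects as well as main effects.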