🤖 AI Summary
This study investigates whether large language models (LLMs) exhibit systematic behavioral biases in economic and financial decision-making and proposes strategies to mitigate them. Drawing on paradigms from cognitive psychology and experimental economics, the authors systematically apply classic human bias experiments to multiple versions and scales of LLMs, analyzing their behavior in preference- and belief-based tasks. The findings reveal that larger models more closely mirror human irrationality in preference tasks, yet demonstrate greater rationality in belief tasks. Importantly, the study shows that rationality-oriented prompt engineering can significantly reduce such biases. These results uncover a nuanced relationship between model scale and decision rationality and introduce an effective prompting-based correction method to enhance the reliability of LLMs in economic reasoning contexts.
📝 Abstract
Do generative AI models, particularly large language models (LLMs), exhibit systematic behavioral biases in economic and financial decisions? If so, how can these biases be mitigated? Drawing on the cognitive psychology and experimental economics literatures, we conduct the most comprehensive set of experiments to date (experiments originally designed to document human biases) on prominent LLM families across model versions and scales. We document systematic patterns in LLM behavior. In preference-based tasks, responses become more human-like as models become more advanced or larger, while in belief-based tasks, advanced large-scale models frequently generate rational responses. Prompting LLMs to make rational decisions reduces biases.
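To make the prompting-based mitigation concrete, below is a minimal sketch of what a rationality-oriented prompt comparison could look like. It is not the authors' exact protocol: the instruction wording, the choice of the Linda conjunction-fallacy vignette as the bias elicitation, and the model name are all assumptions for illustration. The sketch uses the OpenAI Python SDK to run the same task with and without a rationality instruction.

```python
# Illustrative sketch of rationality-oriented prompting; the instruction
# wording, task, and model are assumptions, not the paper's protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A classic bias elicitation: the Linda conjunction-fallacy vignette.
TASK = (
    "Linda is 31, single, outspoken, and very bright. She majored in "
    "philosophy and was deeply concerned with social justice.\n"
    "Which is more probable?\n"
    "(a) Linda is a bank teller.\n"
    "(b) Linda is a bank teller and is active in the feminist movement."
)

# Hypothetical rationality instruction of the kind the study evaluates.
RATIONAL_SYSTEM = (
    "You are a rational decision maker. Apply the laws of probability "
    "and expected-utility reasoning; ignore framing and stereotypes."
)

def ask(system_prompt: str | None) -> str:
    """Run the task once, optionally prepending a system instruction."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": TASK})
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model; the paper tests several families
        messages=messages,
        temperature=0,
    )
    return resp.choices[0].message.content

baseline = ask(None)             # no instruction: may commit the fallacy (b)
debiased = ask(RATIONAL_SYSTEM)  # rationality prompt: should favor (a)
print("baseline:", baseline)
print("debiased:", debiased)
```

Comparing the two responses over a battery of such vignettes is one way to quantify how much a rationality instruction shifts answers toward the normative choice, which is the spirit of the correction method the abstract reports.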