Large Language Models are Near-Optimal Decision-Makers with a Non-Human Learning Behavior

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how large language models (LLMs) handle real-world decision-making under uncertainty, risk, and task switching (set-shifting): three core dimensions where human judgment is well characterized but LLM behavior remains poorly understood. Method: the authors administered canonical experimental psychology paradigms, including gambling tasks, probabilistic reasoning, and the Wisconsin Card Sorting Test, to five state-of-the-art LLMs and 360 human participants, using zero-shot and few-shot evaluation protocols. Contribution/Results: across all three domains, the LLMs often outperformed humans, approaching Bayesian-optimal performance. However, their decision processes lacked human-like heuristics and metacognitive regulation, relying instead on non-anthropomorphic learning and inference mechanisms. These findings reveal a tension between high behavioral competence and low cognitive alignment with humans, and they provide empirical grounding for caution when deploying LLMs in high-stakes decision-making contexts.
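To make the benchmarked setting concrete, the sketch below simulates one of the task families the summary describes: a probabilistic reversal-learning task, where the rewarding option switches unpredictably and a Bayesian ideal observer tracks which option is currently best. All specifics here (the `ideal_observer` function, the 0.8/0.2 reward probabilities, the 2% per-trial reversal hazard) are illustrative assumptions, not parameters taken from the paper.

```python
import random

P_GOOD, P_BAD = 0.8, 0.2  # assumed reward probabilities of the good/bad arm
HAZARD = 0.02             # assumed per-trial probability that the arms swap

def ideal_observer(n_trials=2000, seed=0):
    """Bayes-optimal play of a two-armed probabilistic reversal task.

    Returns the fraction of trials on which the observer chose the
    currently rewarding arm.
    """
    rng = random.Random(seed)
    good_arm = 0    # hidden state: which arm currently pays P_GOOD
    belief = 0.5    # P(arm 0 is currently the good arm)
    correct = 0
    for _ in range(n_trials):
        if rng.random() < HAZARD:          # hidden reversal
            good_arm = 1 - good_arm
        choice = 0 if belief >= 0.5 else 1  # greedy on the posterior
        reward = rng.random() < (P_GOOD if choice == good_arm else P_BAD)
        # Likelihood of the observed outcome under each hypothesis.
        like0 = P_GOOD if choice == 0 else P_BAD  # hypothesis: arm 0 is good
        like0 = like0 if reward else 1 - like0
        like1 = P_GOOD if choice == 1 else P_BAD  # hypothesis: arm 1 is good
        like1 = like1 if reward else 1 - like1
        belief = belief * like0 / (belief * like0 + (1 - belief) * like1)
        # Diffuse the belief to account for a possible reversal next trial.
        belief = belief * (1 - HAZARD) + (1 - belief) * HAZARD
        correct += choice == good_arm
    return correct / n_trials
```

An agent scoring close to this observer's accuracy is "near-optimal" in the sense the summary uses; a human-like learner typically falls short of it, e.g. by perseverating after a reversal.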

📝 Abstract
Human decision-making lies at the foundation of our society and civilization, but we are on the verge of a future where much of it will be delegated to artificial intelligence. The arrival of Large Language Models (LLMs) has transformed the nature and scope of AI-supported decision-making; however, the process by which they learn to make decisions, compared to humans, remains poorly understood. In this study, we examined the decision-making behavior of five leading LLMs across three core dimensions of real-world decision-making: uncertainty, risk, and set-shifting. Using three well-established experimental psychology tasks designed to probe these dimensions, we benchmarked LLMs against 360 newly recruited human participants. Across all tasks, LLMs often outperformed humans, approaching near-optimal performance. Moreover, the processes underlying their decisions diverged fundamentally from those of humans. On the one hand, our findings demonstrate the ability of LLMs to manage uncertainty, calibrate risk, and adapt to changes. On the other hand, this disparity highlights the risks of relying on them as substitutes for human judgment, calling for further inquiry.
Problem

Research questions and friction points this paper is trying to address.

Understanding how LLMs learn decision-making compared to humans
Assessing LLM performance in uncertainty, risk, and set-shifting tasks
Evaluating risks of using LLMs as substitutes for human judgment
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs benchmarked using experimental psychology tasks
LLMs outperform humans in decision-making tasks
LLMs exhibit non-human learning behavior
Authors

Hao Li, School of Cybersecurity, Northwestern Polytechnical University, China
Gengrui Zhang, Department of Psychology, University of Southern California, United States
Petter Holme, Aalto University (computational social science, AI and society, network science, complex systems)
Shuyue Hu, Shanghai Artificial Intelligence Lab (multiagent systems, large language models, game theory)
Zhen Wang, School of Cybersecurity, Northwestern Polytechnical University, China