Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI

📅 2024-09-26

🤖 AI Summary
This study addresses pervasive cognitive biases in the decision-making of large language models (LLMs). We systematically evaluate GPT-4o, Gemma 2, and Llama 3.1 across nine canonical cognitive biases using a standardized psychological experimental paradigm. The methodology combines a novel, scalable prompt template with a three-part evaluation framework: response consistency scoring, logical contradiction detection, and bias strength quantification. It draws on 1,500 controlled trials and is presented as the first large-scale, cross-model, multi-bias quantitative analysis of its kind. GPT-4o shows the strongest overall robustness. Gemma 2 outperforms the other models on sunk-cost and prospect-theory tasks. Llama 3.1 frequently generates logically inconsistent responses, indicating non-robust reasoning. The study also proposes treating statistical reasoning competence and ethical alignment jointly as a benchmark for AGI robustness, and it aims to provide a reproducible methodological foundation and empirical evidence base for developing trustworthy AI systems.
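
The summary names response consistency scoring and bias strength quantification but does not say how either is computed. As a rough sketch under assumed definitions (not the paper's actual metrics), consistency could be taken as the share of repeated trials that agree with the most common answer, and bias strength as the share of trials that pick the option the bias predicts. The function names and the sample data below are hypothetical.

```python
from collections import Counter


def consistency_score(responses):
    """Assumed definition: fraction of responses matching the modal answer."""
    if not responses:
        raise ValueError("responses must be non-empty")
    modal_count = Counter(responses).most_common(1)[0][1]
    return modal_count / len(responses)


def bias_rate(responses, biased_option):
    """Assumed definition: fraction of responses choosing the bias-predicted option."""
    if not responses:
        raise ValueError("responses must be non-empty")
    return sum(r == biased_option for r in responses) / len(responses)


# Hypothetical answers from four repeated trials of one bias prompt.
trials = ["A", "A", "B", "A"]
print(consistency_score(trials))
print(bias_rate(trials, biased_option="A"))
```

Under these assumed definitions, a model can be highly consistent and strongly biased at the same time, so the two numbers are best read together.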

📝 Abstract
We investigate the presence of cognitive biases in three large language models (LLMs): GPT-4o, Gemma 2, and Llama 3.1. The study uses 1,500 experiments across nine established cognitive biases to evaluate the models' responses and consistency. GPT-4o demonstrated the strongest overall performance. Gemma 2 showed strengths in addressing the sunk cost fallacy and prospect theory, but its performance varied across the other biases. Llama 3.1 consistently underperformed, relying on heuristics and exhibiting frequent inconsistencies and contradictions. The findings highlight the challenges of achieving robust and generalizable reasoning in LLMs and underscore the need for further development to mitigate biases in artificial general intelligence (AGI). The study emphasizes the importance of integrating statistical reasoning and ethical considerations in future AI development.

Problem

Research questions and friction points this paper is trying to address.

Investigates cognitive biases in GPT-4o, Gemma 2, and Llama 3.1.
Highlights challenges in achieving robust reasoning in large language models.
Emphasizes need for bias mitigation in artificial general intelligence.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates nine established cognitive biases in three LLMs through 1,500 experiments.
Highlights GPT-4o's strongest overall performance among the three models.
Emphasizes integrating statistical reasoning and ethical considerations in AI development.