Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs

📅 2025-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the underexplored role of cognitive rigidity—specifically, “mental set”—in large language models (LLMs), investigating its impact on strategy switching and adaptive reasoning. Method: We systematically introduce mental set concepts from cognitive psychology into LLM evaluation, constructing a novel mental-set-inducing benchmark. Using parameter-efficient fine-tuning (PEFT), in-context learning (ICL), and cross-model analysis (Llama-3.1-8B/70B, GPT-4o), we evaluate performance on strategy-reversal tasks. Contribution/Results: We observe significant performance degradation (23–41%) across mainstream LLMs on reversal tasks, confirming entrenched pattern dependence. Critically, we move beyond static benchmarks (e.g., MMLU, GSM8K) to propose the first dynamic evaluation framework explicitly targeting cognitive flexibility in reasoning. This framework provides a new paradigm for diagnosing LLM reasoning bottlenecks and advancing generalization capabilities through adaptive inference.

📝 Abstract
In this paper, we present an investigative study on how Mental Sets influence the reasoning capabilities of LLMs. LLMs have excelled in diverse natural language processing (NLP) tasks, driven by advancements in parameter-efficient fine-tuning (PEFT) and emergent capabilities like in-context learning (ICL). For complex reasoning tasks, selecting the right model for PEFT or ICL is critical, and the choice often relies on scores on benchmarks such as MMLU, MATH, and GSM8K. However, current evaluation methods, based on metrics like F1 Score or reasoning-chain assessments by larger models, overlook a key dimension: adaptability to unfamiliar situations and the ability to overcome entrenched thinking patterns. In cognitive psychology, a Mental Set refers to the tendency to persist with previously successful strategies even when they become inefficient, a known obstacle to problem solving and reasoning. We compare the performance of LLMs such as Llama-3.1-8B-Instruct, Llama-3.1-70B-Instruct, and GPT-4o in the presence of mental sets. To the best of our knowledge, this is the first study to integrate cognitive psychology concepts into the evaluation of LLMs for complex reasoning tasks, providing deeper insights into their adaptability and problem-solving efficacy.
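The summary reports performance degradation on strategy-reversal tasks relative to set-inducing tasks. As a rough illustration (not the paper's actual protocol; `Trial`, the phase labels, and the scoring function are all hypothetical), one could measure a mental-set effect as the accuracy gap between trials where the entrenched strategy works ("induction") and trials where it must be abandoned ("reversal"):

```python
from dataclasses import dataclass

@dataclass
class Trial:
    prompt: str   # problem shown to the model
    gold: str     # correct answer
    phase: str    # "induction" (entrenched strategy works) or "reversal" (it fails)

def mental_set_degradation(trials: list[Trial], answers: list[str]) -> float:
    """Accuracy gap between set-inducing and reversal trials.

    A large positive gap suggests the model keeps applying the entrenched
    strategy after it stops working, i.e. it is trapped in a mental set.
    """
    def accuracy(phase: str) -> float:
        pairs = [(t, a) for t, a in zip(trials, answers) if t.phase == phase]
        return sum(t.gold == a for t, a in pairs) / len(pairs)

    return accuracy("induction") - accuracy("reversal")

# Toy run: the model solves both induction trials but misses one of two
# reversal trials, giving a 0.5 (50-point) degradation.
trials = [
    Trial("p1", "a", "induction"),
    Trial("p2", "b", "induction"),
    Trial("p3", "c", "reversal"),
    Trial("p4", "d", "reversal"),
]
answers = ["a", "b", "c", "x"]
print(mental_set_degradation(trials, answers))  # 0.5
```

In this framing, the 23–41% drops reported above would correspond to the gap computed here, with model answers collected from each LLM under a fixed prompt sequence.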
Problem

Research questions and friction points this paper is trying to address.

Psychological Set
Large Language Models
Complex Reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Psychological Priming
Large Language Models
Complex Reasoning Tasks