Individual differences in the cognitive mechanisms of planning strategy discovery

📅 2025-05-29
🤖 AI Summary
Problem: When fitted to human data, existing metacognitive reinforcement learning (MCRL) models discover planning strategies more slowly than people do, and the cognitive mechanisms behind this human advantage, along with their variation across individuals, remain poorly understood. Method: We combined analysis of planning task data with computational modeling to test three candidate mechanisms that might facilitate strategy discovery: intrinsically generated metacognitive pseudo-rewards, subjective effort valuation, and termination deliberation. Contribution/Results: All three mechanisms were found to facilitate strategy discovery. A larger proportion of participants used at least one of them, with significant individual differences in usage and varying impacts on strategy discovery. These enhancements did not fully close the gap between MCRL models and human strategy discovery, which prompts further exploration of additional factors that people might use to discover new planning strategies.

Technology Category

Application Category

📝 Abstract
People employ efficient planning strategies. But how are these strategies acquired? Previous research suggests that people can discover new planning strategies through learning from reinforcements, a process known as metacognitive reinforcement learning (MCRL). While prior work has shown that MCRL models can learn new planning strategies and explain the experience-driven strategy discovery of more participants than alternative mechanisms do, it also revealed significant individual differences in metacognitive learning. Furthermore, when fitted to human data, these models exhibit a slower rate of strategy discovery than humans. In this study, we investigate whether incorporating cognitive mechanisms that might facilitate human strategy discovery can bring models of MCRL closer to human performance. Specifically, we consider intrinsically generated metacognitive pseudo-rewards, subjective effort valuation, and termination deliberation. Analysis of planning task data shows that a larger proportion of participants used at least one of these mechanisms, with significant individual differences in their usage and varying impacts on strategy discovery. Metacognitive pseudo-rewards, subjective effort valuation, and learning the value of acting without further planning were found to facilitate strategy discovery. While these enhancements provided valuable insights into individual differences and the effect of these mechanisms on strategy discovery, they did not fully close the gap between model and human performance, prompting further exploration of additional factors that people might use to discover new planning strategies.

Problem

Research questions and friction points this paper is trying to address.

Investigating cognitive mechanisms for human planning strategy discovery
Exploring individual differences in metacognitive reinforcement learning
Bridging the gap between model and human strategy discovery performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Intrinsically generated metacognitive pseudo-rewards facilitate strategy discovery
Subjective effort valuation facilitates strategy discovery
Termination deliberation (learning the value of acting without further planning) facilitates strategy discovery
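
The three mechanisms listed above can be pictured as adjustments to the learning signal of a metacognitive learner. The Python sketch below is illustrative only and is not the authors' model: the names `MechanismWeights`, `learning_signal`, and `update_termination_value`, the linear form of the signal, and the definition of the pseudo-reward as the change in the estimated value of the best plan are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class MechanismWeights:
    """Hypothetical per-participant parameters (assumed, not from the paper)."""
    pseudo_reward_weight: float = 0.0  # 0.0 switches pseudo-rewards off
    effort_weight: float = 1.0         # subjective scaling of planning cost


def learning_signal(external_reward: float,
                    planning_cost: float,
                    value_before: float,
                    value_after: float,
                    weights: MechanismWeights) -> float:
    """Combine external reward, subjectively weighted effort, and a pseudo-reward.

    The pseudo-reward is assumed to be the change in the estimated value
    of the best available plan caused by one planning operation.
    """
    pseudo_reward = value_after - value_before
    return (external_reward
            - weights.effort_weight * planning_cost
            + weights.pseudo_reward_weight * pseudo_reward)


def update_termination_value(current_estimate: float,
                             observed_return: float,
                             learning_rate: float) -> float:
    """Incrementally learn the value of acting without further planning
    (a simple delta-rule update, assumed here for illustration)."""
    return current_estimate + learning_rate * (observed_return - current_estimate)


if __name__ == "__main__":
    weights = MechanismWeights(pseudo_reward_weight=0.5, effort_weight=2.0)
    # One planning operation costing 1.0 raises the best plan's value from 2.0 to 5.0.
    print(learning_signal(0.0, 1.0, 2.0, 5.0, weights))
    # Stopping planning and acting yielded a return of 10.0.
    print(update_termination_value(0.0, 10.0, learning_rate=0.1))
```

Under these assumptions, individual differences would show up as different fitted values of the weights per participant, and setting `pseudo_reward_weight` to 0.0 and `effort_weight` to 1.0 recovers a learner driven only by external reward minus objective planning cost.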
Ruiqi He
Max Planck Institute for Intelligent Systems, Tuebingen, BW 72076
Falk Lieder
UCLA, Asst. Prof. of Psychology
bounded rationality, moral psychology, decision making, moral learning