Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis

📅 2025-03-17
🤖 AI Summary
Problem: Current interpretability research on large language models (LLMs) lacks a unified theoretical framework, hindering systematic understanding of their behavioral logic and internal mechanisms. Method: We systematically adapt Marr’s three levels of analysis from cognitive science (computational, algorithmic, and implementation) to LLM analysis, establishing an interdisciplinary interpretability framework. Our approach integrates cognitive modeling, behavioral experimentation, neurosymbolic interface analysis, and representation probing, yielding a reusable, cognitive-science-inspired analytical protocol. Contribution/Results: Validated across multiple mainstream LLMs, the framework enables rigorous mechanistic attribution, root-cause tracing of biases, and fine-grained capability decomposition. It advances LLM understanding beyond empirical engineering heuristics toward principled, scientifically grounded modeling and explanation, bridging cognitive theory and foundation model science.

📝 Abstract
Modern artificial intelligence systems, such as large language models, are increasingly powerful but also increasingly hard to understand. Recognizing this problem as analogous to the historical difficulties in understanding the human mind, we argue that methods developed in cognitive science can be useful for understanding large language models. We propose a framework for applying these methods based on Marr's three levels of analysis. By revisiting established cognitive science techniques relevant to each level and illustrating their potential to yield insights into the behavior and internal organization of large language models, we aim to provide a toolkit for making sense of these new kinds of minds.
Problem

Research questions and friction points this paper is trying to address.

Understanding large language models using cognitive science methods
Applying Marr's three levels of analysis to AI systems
Providing tools to interpret behavior and internal organization of AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

Apply cognitive science methods to AI
Use Marr's levels for model analysis
Develop toolkit for understanding AI minds
Authors

Alexander Ku
Research Scientist, Google DeepMind, Princeton University
Cognitive Science, Artificial Intelligence, Machine Learning

Declan Campbell
Graduate Student, Princeton Neuroscience Institute

Xuechunzi Bai
Assistant Professor of Psychology, University of Chicago
social psychology, cognitive science, computational methods, stereotype

Jiayi Geng
Carnegie Mellon University
Natural Language Processing, Machine Learning, Cognitive Science

Ryan Liu
PhD Student in Computer Science, Princeton University
Large Language Models, Computational Cognitive Science, NLP Applications

Raja Marjieh
PhD Candidate, Princeton University
Cognitive Science, Perception, Machine Learning, Statistical Physics

R. Thomas McCoy
Assistant Professor of Linguistics, Yale University
Computational Linguistics, Linguistics, Cognitive Science

Andrew Nam
Princeton Laboratory for Artificial Intelligence, Princeton University

Ilia Sucholutsky
New York University
deep learning, representation learning, small data, representational alignment, AI/ML

Veniamin Veselovsky
PhD student at Princeton CS
nlp, computational social science, ai agents

Liyi Zhang
Princeton University
Machine Learning, Bayesian Statistics, Bayesian Deep Learning, Approximate Inference

Jian-Qiao Zhu
Department of Computer Science, Princeton University
Cognitive Science, Behavioral Science, Machine Learning

Thomas L. Griffiths
Professor of Psychology and Computer Science, Princeton University
Computational Models of Cognition, Cognitive Science, Machine Learning, Cognitive Psychology, Bayesian Statistics