PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning

📅 2025-09-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the inefficiency and inaccuracy of large language models (LLMs) on multi-hop and knowledge-intensive reasoning tasks. We propose PRIME, the first multi-agent framework that formally integrates the dual-process theory of human cognition—System 1 (intuitive, rapid reasoning) and System 2 (deliberative, sequential reasoning)—into LLM-based inference architectures. PRIME orchestrates dynamically coordinated agents—including fast-response, planning, hypothesis-generation, retrieval, integration, and decision modules—to enable adaptive inter-system scheduling and closed-loop information flow. Built end-to-end atop LLaMA-3, PRIME achieves state-of-the-art performance on multiple multi-hop reasoning benchmarks: it is the first open-weight model to match GPT-4/GPT-4o in accuracy, substantially narrowing the gap with top proprietary models. Our core contribution is the first scalable, interpretable dual-process coordination mechanism, establishing a novel paradigm for efficient and robust complex reasoning.

Technology Category

Application Category

📝 Abstract
Inspired by the dual-process theory of human cognition from extit{Thinking, Fast and Slow}, we introduce extbf{PRIME} (Planning and Retrieval-Integrated Memory for Enhanced Reasoning), a multi-agent reasoning framework that dynamically integrates extbf{System 1} (fast, intuitive thinking) and extbf{System 2} (slow, deliberate thinking). PRIME first employs a Quick Thinking Agent (System 1) to generate a rapid answer; if uncertainty is detected, it then triggers a structured System 2 reasoning pipeline composed of specialized agents for extit{planning}, extit{hypothesis generation}, extit{retrieval}, extit{information integration}, and extit{decision-making}. This multi-agent design faithfully mimics human cognitive processes and enhances both efficiency and accuracy. Experimental results with LLaMA 3 models demonstrate that PRIME enables open-source LLMs to perform competitively with state-of-the-art closed-source models like GPT-4 and GPT-4o on benchmarks requiring multi-hop and knowledge-grounded reasoning. This research establishes PRIME as a scalable solution for improving LLMs in domains requiring complex, knowledge-intensive reasoning.
Problem

Research questions and friction points this paper is trying to address.

Integrates fast intuitive and slow deliberate thinking processes
Enhances reasoning efficiency and accuracy in language models
Enables competitive performance on complex knowledge-intensive tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates fast intuitive and slow deliberate thinking
Uses multi-agent pipeline for planning and retrieval
Enables open-source models to compete with closed-source ones
🔎 Similar Papers
No similar papers found.
Hieu Tran
Hieu Tran
University of Maryland, College Park
Natural Language ProcessingLarge Language Models
Zonghai Yao
Zonghai Yao
Umass Amherst
Medical-LLMMulti-agent AI HospitalClinical ReasoningSynthetic DataPatient Education
N
Nguyen Luong Tran
Manning College of Information and Computer Sciences, University of Massachusetts Amherst, MA, USA
Z
Zhichao Yang
Manning College of Information and Computer Sciences, University of Massachusetts Amherst, MA, USA
Feiyun Ouyang
Feiyun Ouyang
PostDoc of Umass Lowell
Public HealthComputer ScienceNLPEpidemiology
S
Shuo Han
Miner School of Computer and Information Sciences, University of Massachusetts Lowell, MA, USA
R
Razieh Rahimi
Manning College of Information and Computer Sciences, University of Massachusetts Amherst, MA, USA
H
Hong Yu
Manning College of Information and Computer Sciences, University of Massachusetts Amherst, MA, USA; Miner School of Computer and Information Sciences, University of Massachusetts Lowell, MA, USA