PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning

📅 2025-09-26

📈 Citations: 0

✨ Influential: 0

career value

206K/year

🤖 AI Summary

This work addresses the inefficiency and inaccuracy of large language models (LLMs) on multi-hop and knowledge-intensive reasoning tasks. We propose PRIME, the first multi-agent framework that formally integrates the dual-process theory of human cognition—System 1 (intuitive, rapid reasoning) and System 2 (deliberative, sequential reasoning)—into LLM-based inference architectures. PRIME orchestrates dynamically coordinated agents—including fast-response, planning, hypothesis-generation, retrieval, integration, and decision modules—to enable adaptive inter-system scheduling and closed-loop information flow. Built end-to-end atop LLaMA-3, PRIME achieves state-of-the-art performance on multiple multi-hop reasoning benchmarks: it is the first open-weight model to match GPT-4/GPT-4o in accuracy, substantially narrowing the gap with top proprietary models. Our core contribution is the first scalable, interpretable dual-process coordination mechanism, establishing a novel paradigm for efficient and robust complex reasoning.

Technology Category

Application Category

📝 Abstract

Inspired by the dual-process theory of human cognition from extit{Thinking, Fast and Slow}, we introduce extbf{PRIME} (Planning and Retrieval-Integrated Memory for Enhanced Reasoning), a multi-agent reasoning framework that dynamically integrates extbf{System 1} (fast, intuitive thinking) and extbf{System 2} (slow, deliberate thinking). PRIME first employs a Quick Thinking Agent (System 1) to generate a rapid answer; if uncertainty is detected, it then triggers a structured System 2 reasoning pipeline composed of specialized agents for extit{planning}, extit{hypothesis generation}, extit{retrieval}, extit{information integration}, and extit{decision-making}. This multi-agent design faithfully mimics human cognitive processes and enhances both efficiency and accuracy. Experimental results with LLaMA 3 models demonstrate that PRIME enables open-source LLMs to perform competitively with state-of-the-art closed-source models like GPT-4 and GPT-4o on benchmarks requiring multi-hop and knowledge-grounded reasoning. This research establishes PRIME as a scalable solution for improving LLMs in domains requiring complex, knowledge-intensive reasoning.

Problem

Research questions and friction points this paper is trying to address.

Integrates fast intuitive and slow deliberate thinking processes

Enhances reasoning efficiency and accuracy in language models

Enables competitive performance on complex knowledge-intensive tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates fast intuitive and slow deliberate thinking

Uses multi-agent pipeline for planning and retrieval

Enables open-source models to compete with closed-source ones

🔎 Similar Papers

Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning