🤖 AI Summary
To address the factual inaccuracy of large language models (LLMs) in healthcare prediction, this paper tackles two core challenges: deciding when retrieval is clinically necessary, and jointly optimizing the retriever and the generator. It proposes GHAR, a generative hierarchical agentic RAG framework with a dual-agent architecture: Agent-Top decides whether to rely on parametric knowledge or to trigger retrieval, and Agent-Low summarizes the task-relevant knowledge once retrieval is triggered. Both agents are optimized within a single Markov Decision Process (MDP), with rewards that align them on accurate prediction while preserving their distinct roles. Technically, GHAR draws on retrieval-augmented generation (RAG), GraphRAG, hierarchical agent systems, and reinforcement learning. Evaluated on three benchmark datasets across three healthcare prediction tasks, GHAR outperforms state-of-the-art baselines.
📝 Abstract
Accurate healthcare prediction is critical for improving patient outcomes and reducing operational costs. Bolstered by growing reasoning capabilities, large language models (LLMs) offer a promising path to enhance healthcare predictions by drawing on their rich parametric knowledge. However, LLMs are prone to factual inaccuracies due to limitations in the reliability and coverage of their embedded knowledge. While retrieval-augmented generation (RAG) frameworks, such as GraphRAG and its variants, have been proposed to mitigate these issues by incorporating external knowledge, they face two key challenges in the healthcare scenario: (1) identifying when retrieval is clinically necessary, and (2) achieving synergy between the retriever and the generator so that retrievals are contextually appropriate. To address these challenges, we propose GHAR, a generative hierarchical agentic RAG framework that simultaneously resolves when to retrieve and how to optimize the collaboration between submodules in healthcare. Specifically, for the first challenge, we design a dual-agent architecture comprising Agent-Top and Agent-Low. Agent-Top acts as the primary physician, iteratively deciding whether to rely on parametric knowledge or to initiate retrieval, while Agent-Low acts as the consulting service, summarizing all task-relevant knowledge once retrieval is triggered. To tackle the second challenge, we unify the optimization of both agents within a formal Markov Decision Process, designing diverse rewards to align their shared goal of accurate prediction while preserving their distinct roles. Extensive experiments on three benchmark datasets across three popular tasks demonstrate the superiority of GHAR over state-of-the-art baselines, highlighting the potential of hierarchical agentic RAG in advancing healthcare systems.
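The dual-agent control flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `State`, `agent_top`, `agent_low`, and `run_episode`, the confidence threshold, the retrieval budget, the keyword-overlap "summarization", and the reward shape (a shared accuracy term plus a retrieval cost charged to Agent-Top) are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class State:
    """Hypothetical MDP state shared by both agents."""
    record: str                                         # patient context
    evidence: List[str] = field(default_factory=list)   # summaries from Agent-Low
    n_retrievals: int = 0


def agent_top(state: State, confidence: Callable[[State], float],
              threshold: float = 0.7, max_retrievals: int = 2) -> str:
    """Assumed policy: retrieve while confidence is low and budget remains."""
    if state.n_retrievals < max_retrievals and confidence(state) < threshold:
        return "retrieve"
    return "predict"


def agent_low(state: State, knowledge: List[str]) -> str:
    """Toy stand-in for summarization: keep facts sharing a word with the record."""
    words = set(state.record.lower().split())
    hits = [fact for fact in knowledge if words & set(fact.lower().split())]
    return " ".join(hits)


def run_episode(record: str, knowledge: List[str],
                confidence: Callable[[State], float],
                predict: Callable[[State], int], label: int,
                retrieval_cost: float = 0.1) -> Tuple[int, float, float, int]:
    """Run one episode and return (prediction, reward_top, reward_low, n_retrievals)."""
    state = State(record)
    while agent_top(state, confidence) == "retrieve":
        state.evidence.append(agent_low(state, knowledge))
        state.n_retrievals += 1
    prediction = predict(state)
    shared = 1.0 if prediction == label else 0.0   # shared goal: accurate prediction
    # Assumed role-specific term: Agent-Top pays for each retrieval it triggers.
    reward_top = shared - retrieval_cost * state.n_retrievals
    reward_low = shared
    return prediction, reward_top, reward_low, state.n_retrievals


# Toy usage with made-up knowledge and hand-written confidence/predict functions.
knowledge = ["sepsis raises readmission risk", "statins lower cholesterol"]
confidence = lambda s: 0.9 if s.evidence else 0.4
predict = lambda s: 1 if any("readmission" in e for e in s.evidence) else 0
result = run_episode("patient admitted with sepsis", knowledge,
                     confidence, predict, label=1)
```

In this toy run, Agent-Top triggers one retrieval because its initial confidence is below the threshold, Agent-Low returns the single matching fact, and the episode then ends with a prediction. In a learned system both policies would be trained from the rewards rather than hand-written as they are here.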