SEMA-RAG: A Self-Evolving Multi-Agent Retrieval-Augmented Generation Framework for Medical Reasoning

📅 2026-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional single-turn, static retrieval-augmented generation (RAG) is poorly matched to the multi-stage nature of clinical reasoning, which often leads to shallow semantic understanding of the question and weak evidence chains. This work proposes a self-evolving multi-agent RAG framework that, for the first time, decomposes clinical reasoning into three specialized agent roles: interpretation, exploration, and adjudication. By combining semantics-driven question parsing, dynamic multi-turn retrieval, and an evidence adjudication mechanism, the framework lets retrieval and reasoning refine each other iteratively. Experiments across five medical question-answering benchmarks and five large language models show that the proposed method improves average accuracy by 6.46 percentage points over the strongest baseline.

📝 Abstract

Retrieval-Augmented Generation (RAG) is widely employed to mitigate risks such as hallucinations and knowledge obsolescence in medical question answering, yet its predominantly single-round, static retrieval paradigm misaligns with the multi-stage process of clinical reasoning. This compressed workflow induces two structural deficiencies: question-to-query translation often lacks clinically grounded semantic interpretation, and retrieval lacks iterative sufficiency feedback, making it difficult to form reliable evidence chains. We argue that both issues stem from a deeper cause: overloading a single reasoning chain with heterogeneous tasks of interpretation, exploration, and adjudication. The remedy is to reconstruct the workflow via task decoupling and dynamic multi-round exploration. To this end, we propose SEMA-RAG, a Self-Evolving Multi-Agent RAG framework for medical question answering, which assigns these roles to three specialist agents: the Interpreter Agent for clinical schema interpretation, the Explorer Agent for sufficiency-driven self-evolving retrieval, and the Arbiter Agent for evidence adjudication and answer selection. Across five benchmarks and five LLM backbones, SEMA-RAG improves over the strongest baseline by 6.46 accuracy points on average, measured per backbone.
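
The division of labor described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation. The toy corpus, the keyword-overlap retriever, the term-coverage sufficiency check, and every function name here (`interpret`, `explore`, `arbitrate`, `answer`) are assumptions made for illustration, and each agent is replaced by a simple rule where a real system would presumably call an LLM. The sketch only shows the control flow: interpret the question once, retrieve in rounds until the evidence looks sufficient, then pick an answer from the evidence.

```python
from dataclasses import dataclass

# Toy stand-in for a medical knowledge base (hypothetical content).
CORPUS = [
    "metformin is first-line therapy for type 2 diabetes",
    "metformin is contraindicated in severe renal impairment",
    "insulin is used in type 1 diabetes",
]
STOPWORDS = {"which", "drug", "is", "for", "in", "the", "a", "of", "and"}


@dataclass
class Schema:
    """Structured reading of a question (hypothetical fields)."""
    question: str
    options: dict[str, str]
    key_terms: list[str]


def interpret(question: str, options: dict[str, str]) -> Schema:
    """Interpreter stand-in: reduce the question to its key terms."""
    words = question.lower().replace("?", "").split()
    return Schema(question, options, [w for w in words if w not in STOPWORDS])


def retrieve(query: list[str], seen: list[str]):
    """Return the unseen passage with the largest keyword overlap, or None."""
    best, best_score = None, 0
    for doc in CORPUS:
        if doc in seen:
            continue
        score = len(set(query) & set(doc.split()))
        if score > best_score:
            best, best_score = doc, score
    return best


def covered_words(evidence: list[str]) -> set[str]:
    return {w for doc in evidence for w in doc.split()}


def explore(schema: Schema, max_rounds: int = 3) -> list[str]:
    """Explorer stand-in: retrieve in rounds, stopping once evidence suffices."""
    evidence: list[str] = []
    query = list(schema.key_terms)
    for _ in range(max_rounds):
        doc = retrieve(query, evidence)
        if doc is None:
            break  # nothing relevant left to retrieve
        evidence.append(doc)
        missing = [t for t in schema.key_terms if t not in covered_words(evidence)]
        if not missing:
            break  # assumed sufficiency criterion: every key term is covered
        query = missing  # next round targets only the uncovered terms
    return evidence


def arbitrate(schema: Schema, evidence: list[str]) -> str:
    """Arbiter stand-in: choose the option best supported by the evidence."""
    words = covered_words(evidence)
    return max(
        schema.options,
        key=lambda k: len(set(schema.options[k].lower().split()) & words),
    )


def answer(question: str, options: dict[str, str]) -> str:
    schema = interpret(question, options)
    return arbitrate(schema, explore(schema))
```

Calling `answer("Which drug is first-line therapy for type 2 diabetes?", {"A": "insulin", "B": "metformin"})` returns `"B"` after a single retrieval round, since the first passage already covers every key term. A question that also mentions renal impairment triggers a second round, because the refined query is built from the terms the first passage left uncovered.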

Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation

medical reasoning

clinical question answering

evidence chain

multi-round retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent RAG

Self-Evolving Retrieval

Medical Reasoning