Beyond GeneGPT: A Multi-Agent Architecture with Open-Source LLMs for Enhanced Genomic Question Answering

📅 2025-11-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Genomic question-answering systems that rely on closed-source models face poor scalability, privacy risks, and limited generalization. To address these issues, we propose OpenBioLLM, a modular multi-agent framework built on open-source large language models (e.g., Llama 3.1, Qwen2.5). It divides the workflow among three specialized agents (tool routing, query generation, and response validation), enabling role-based collaboration and chain-of-thought reasoning without model fine-tuning, and it supports diverse open-weight backbones. Evaluated on the Gene-Turing and GeneHop benchmarks, OpenBioLLM achieves average scores of 0.849 and 0.830, respectively, matching or outperforming GeneGPT (which combines domain-specific APIs with the proprietary code-davinci-002 model) on over 90% of the benchmark tasks, while reducing latency by 40–50%. Because it does not depend on a proprietary model, the framework addresses the privacy, cost, and scalability concerns raised by GeneGPT's design.

📝 Abstract

Genomic question answering often requires complex reasoning and integration across diverse biomedical sources. GeneGPT addressed this challenge by combining domain-specific APIs with OpenAI's code-davinci-002 large language model to enable natural language interaction with genomic databases. However, its reliance on a proprietary model limits scalability, increases operational costs, and raises concerns about data privacy and generalization. In this work, we revisit and reproduce GeneGPT in a pilot study using open-source models, including Llama 3.1, Qwen2.5, and Qwen2.5 Coder, within a monolithic architecture; this allows us to identify the limitations of this approach. Building on this foundation, we then develop OpenBioLLM, a modular multi-agent framework that extends GeneGPT by introducing agent specialization for tool routing, query generation, and response validation. This enables coordinated reasoning and role-based task execution. OpenBioLLM matches or outperforms GeneGPT on over 90% of the benchmark tasks, achieving average scores of 0.849 on Gene-Turing and 0.830 on GeneHop, while using smaller open-source models without additional fine-tuning or tool-specific pretraining. OpenBioLLM's modular multi-agent design reduces latency by 40-50% across benchmark tasks, significantly improving efficiency without compromising model capability. The results of our comprehensive evaluation highlight the potential of open-source multi-agent systems for genomic question answering. Code and resources are available at https://github.com/ielab/OpenBioLLM.

Problem

Research questions and friction points this paper is trying to address.

Genomic question answering requires complex reasoning across diverse biomedical sources

Proprietary model reliance limits scalability, increases costs and raises privacy concerns

Existing approaches lack modular design for coordinated reasoning and specialized task execution

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses open-source LLMs instead of proprietary models

Implements modular multi-agent framework for specialization

Coordinates specialized agents for tool routing, query generation, and response validation
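
The three agent roles named in the abstract (tool routing, query generation, response validation) can be illustrated with a minimal Python sketch. This is not the OpenBioLLM implementation: the `Pipeline` class, the prompt strings, the retry policy, and the stub functions are all hypothetical, and the canned stubs stand in for open-source LLM calls and genomic database APIs so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt to a completion.
# In a real system this could wrap a call to an open-weight model.
LLM = Callable[[str], str]


@dataclass
class Pipeline:
    """Illustrative three-agent pipeline; every name here is an assumption."""
    router: LLM                               # agent 1: chooses a tool
    query_generator: LLM                      # agent 2: writes the tool query
    validator: LLM                            # agent 3: accepts or rejects a result
    tools: Dict[str, Callable[[str], str]]    # tool name -> callable
    max_retries: int = 1                      # assumed retry policy

    def answer(self, question: str) -> str:
        # Agent 1 selects a tool by name.
        tool_name = self.router(
            f"Pick one tool from {sorted(self.tools)} for: {question}"
        ).strip()
        if tool_name not in self.tools:
            raise ValueError(f"router chose unknown tool: {tool_name!r}")

        result = ""
        for _ in range(self.max_retries + 1):
            # Agent 2 turns the question into a tool-specific query.
            query = self.query_generator(
                f"Write a {tool_name} query for: {question}"
            )
            result = self.tools[tool_name](query)
            # Agent 3 checks the result before it is returned.
            verdict = self.validator(
                f"Question: {question}\nResult: {result}\nReply OK or RETRY."
            )
            if verdict.strip().upper() == "OK":
                return result
        # Retries exhausted: return the last result unvalidated.
        return result


# Canned stubs (assumptions) in place of real models and databases.
def stub_router(prompt: str) -> str:
    return "gene_lookup"


def stub_query_generator(prompt: str) -> str:
    return "BRCA1"


def stub_validator(prompt: str) -> str:
    return "OK" if "located on" in prompt else "RETRY"


def stub_gene_lookup(query: str) -> str:
    return f"{query} is located on chromosome 17"


pipeline = Pipeline(
    router=stub_router,
    query_generator=stub_query_generator,
    validator=stub_validator,
    tools={"gene_lookup": stub_gene_lookup},
)
print(pipeline.answer("Which chromosome is BRCA1 on?"))
```

Because each role is an independent callable, a different open-weight backbone could be assigned to each agent without changing the control flow, which is the kind of modularity the summary describes.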

🔎 Similar Papers

CuriousLLM: Elevating Multi-Document Question Answering with LLM-Enhanced Knowledge Graph Reasoning