KIMAs: A Configurable Knowledge Integrated Multi-Agent System

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing open-source RAG frameworks struggle with low accuracy, inefficiency, and poor configurability when addressing key challenges in knowledge-intensive applications—namely, heterogeneous data integration, weak multi-turn dialogue context management, and stringent low-latency response requirements. To bridge this gap, we propose the Configurable Knowledge-Integrated Multi-Agent System (KIMAs), a novel architecture featuring context-aware query rewriting and dynamic knowledge routing. KIMAs incorporates a lightweight citation generation module and a parallelized multi-agent RAG pipeline, unifying retrieval-augmented generation, contextual modeling, and intelligent knowledge dispatch. The system achieves high retrieval accuracy and dialogue coherence while meeting real-time inference latency constraints. Extensive evaluation across three production-deployed knowledge applications demonstrates KIMAs’ robustness and practicality across diverse scales and application scenarios.

📝 Abstract
Knowledge-intensive conversations supported by large language models (LLMs) have become one of the most popular and helpful applications that can assist people in different aspects. Many current knowledge-intensive applications are centered on retrieval-augmented generation (RAG) techniques. While many open-source RAG frameworks facilitate the development of RAG-based applications, they often fall short in handling practical scenarios complicated by heterogeneous data in topics and formats, conversational context management, and the requirement of low-latency response times. This technical report presents a configurable knowledge-integrated multi-agent system, KIMAs, to address these challenges. KIMAs features a flexible and configurable system for integrating diverse knowledge sources with 1) context management and query rewrite mechanisms to improve retrieval accuracy and multi-turn conversational coherence, 2) efficient knowledge routing and retrieval, 3) simple but effective filter and reference generation mechanisms, and 4) optimized parallelizable multi-agent pipeline execution. Our work provides a scalable framework for advancing the deployment of LLMs in real-world settings. To show how KIMAs can help developers build knowledge-intensive applications with different scales and emphases, we demonstrate how we configure the system for three applications already running in practice with reliable performance.
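The four mechanisms the abstract lists can be pictured as one pipeline: rewrite the query with dialogue context, route it to relevant knowledge sources, retrieve from those sources in parallel, then filter hits and emit references. A minimal sketch of that flow, with entirely hypothetical names and toy keyword matching (KIMAs' actual API and retrieval logic are not shown in this summary):

```python
# Illustrative KIMAs-style pipeline sketch; all names and the keyword-overlap
# "retrieval" are hypothetical stand-ins, not the framework's real components.
from concurrent.futures import ThreadPoolExecutor

# Heterogeneous knowledge sources: source name -> list of (doc_id, text).
KNOWLEDGE_SOURCES = {
    "docs": [("doc-1", "KIMAs integrates heterogeneous knowledge sources.")],
    "faq": [("faq-7", "Query rewriting uses multi-turn dialogue context.")],
}

def rewrite_query(query, history):
    # Naive context-aware rewrite: prepend the previous turn so follow-up
    # questions carry their conversational context into retrieval.
    return f"{history[-1]} {query}" if history else query

def route(query):
    # Knowledge routing: dispatch only to sources whose corpus shares a
    # token with the rewritten query.
    tokens = set(query.lower().split())
    return [name for name, docs in KNOWLEDGE_SOURCES.items()
            if any(tokens & set(text.lower().split()) for _, text in docs)]

def retrieve(source, query):
    # Per-source retrieval agent; here just token-overlap filtering.
    tokens = set(query.lower().split())
    return [(doc_id, text) for doc_id, text in KNOWLEDGE_SOURCES[source]
            if tokens & set(text.lower().split())]

def answer(query, history=()):
    q = rewrite_query(query, list(history))
    sources = route(q)
    # Parallelizable multi-agent execution: query routed sources concurrently.
    with ThreadPoolExecutor() as pool:
        hits = [h for res in pool.map(lambda s: retrieve(s, q), sources)
                for h in res]
    # Lightweight reference generation: deduplicated, sorted citation ids.
    refs = sorted({doc_id for doc_id, _ in hits})
    return {"rewritten": q, "references": refs}
```

For example, `answer("How does rewriting work?", history=["Tell me about query context"])` rewrites the follow-up with the prior turn, routes it to the `faq` source, and returns `faq-7` as its reference.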
Problem

Research questions and friction points this paper is trying to address.

Handles heterogeneous data in topics and formats
Improves retrieval accuracy and conversational coherence
Optimizes multi-agent pipeline for low-latency responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Configurable multi-agent system
Context management mechanisms
Efficient knowledge routing