AI Summary
To address low accuracy in user query-to-FAQ matching caused by semantic ambiguity and linguistic diversity, this paper proposes a multi-agent framework based on Attentive Reasoning Queries (ARQs). The framework combines three components: task-specific JSON queries that guide each agent through structured reasoning steps, a set of specialized agents that each receive different few-shot examples, and a judge agent that re-ranks the candidates they propose. Together these integrate structured reasoning, multi-agent collaboration, and ensemble-based re-ranking in a single end-to-end pipeline. Evaluated on a real-world banking dataset against single-agent approaches, it achieves a 14% increase in Top-1 accuracy, an 18% increase in Top-5 accuracy, and a 12% improvement in Mean Reciprocal Rank (MRR); it also outperforms single-agent baselines on the public LCQMC and FiQA benchmarks. Key contributions include: (1) an ARQ-driven, interpretable reasoning paradigm; (2) a multi-agent ensemble strategy in which each agent receives different few-shot examples; and (3) a task-aware re-ranking mechanism based on a judge agent.
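The ensemble-then-re-rank flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the agents are stubs that return fixed rankings rather than LLM calls, the FAQ identifiers are invented, and the judge agent is replaced by reciprocal rank fusion, a simple rank-aggregation rule that is not the paper's re-ranking method.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical agent type: takes a user query, returns FAQ ids ranked best-first.
Agent = Callable[[str], List[str]]


def make_stub_agent(ranking: List[str]) -> Agent:
    """Stand-in for an LLM agent prompted with its own few-shot examples."""
    def agent(query: str) -> List[str]:
        # A real agent would reason over the query; the stub ignores it.
        return list(ranking)
    return agent


def fuse_rankings(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion, used here in place of an LLM judge agent."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked in rankings:
        for position, faq_id in enumerate(ranked, start=1):
            scores[faq_id] += 1.0 / (k + position)
    # Sort by descending score; break ties by FAQ id for determinism.
    return sorted(scores, key=lambda faq_id: (-scores[faq_id], faq_id))


def annotate(query: str, agents: List[Agent]) -> List[str]:
    """Collect each agent's candidates, then produce one fused ranking."""
    return fuse_rankings([agent(query) for agent in agents])


# Three stub agents that disagree on the ordering of the same candidates.
agents = [
    make_stub_agent(["faq_card_lost", "faq_card_block", "faq_pin_reset"]),
    make_stub_agent(["faq_card_block", "faq_card_lost", "faq_pin_reset"]),
    make_stub_agent(["faq_card_lost", "faq_pin_reset", "faq_card_block"]),
]
final_ranking = annotate("I can't find my debit card", agents)
print(final_ranking)
```

In this toy run the candidate ranked first by two of the three agents ends up on top. A judge agent, as the summary describes it, would instead reason about relevance rather than only aggregate positions.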
Abstract
Modern applications require accurate and efficient retrieval of information in response to user queries. Mapping user utterances to the most relevant Frequently Asked Questions (FAQs) is a crucial component of these systems. Traditional approaches often rely on a single model or technique, which may not capture the nuances of diverse user inquiries. In this paper, we introduce a multi-agent framework for FAQ annotation that combines multiple specialized agents with different approaches and a judge agent that reranks candidates to produce optimal results. Our agents utilize a structured reasoning approach inspired by Attentive Reasoning Queries (ARQs), which guides them through systematic reasoning steps using targeted, task-specific JSON queries. Our framework features a specialized few-shot example strategy, where each agent receives different few-shot examples, enhancing ensemble diversity and coverage of the query space. We evaluate our framework on a real-world banking dataset as well as public benchmark datasets (LCQMC and FiQA), demonstrating significant improvements over single-agent approaches across multiple metrics. On our dataset, these include a 14% increase in Top-1 accuracy, an 18% increase in Top-5 accuracy, and a 12% improvement in Mean Reciprocal Rank; we observe similar gains on the public benchmarks when compared with traditional single-agent annotation techniques. Our framework is particularly effective at handling ambiguous queries, making it well-suited for deployment in production applications while showing strong generalization capabilities across different domains and languages.
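For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Top-k accuracy and Mean Reciprocal Rank are commonly computed. It assumes each query has exactly one gold FAQ; the function names and the toy FAQ identifiers are illustrative and do not come from the paper.

```python
from typing import Sequence


def top_k_accuracy(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of queries whose gold FAQ appears among the first k candidates."""
    hits = sum(1 for ranked, target in zip(rankings, gold) if target in ranked[:k])
    return hits / len(gold)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1/rank of the gold FAQ; a query contributes 0 if it is missing."""
    total = 0.0
    for ranked, target in zip(rankings, gold):
        if target in ranked:
            total += 1.0 / (list(ranked).index(target) + 1)
    return total / len(gold)


# Toy data: the gold FAQ sits at rank 1, 2, and 3 for the three queries.
rankings = [
    ["faq_a", "faq_b", "faq_c"],
    ["faq_b", "faq_a", "faq_c"],
    ["faq_c", "faq_b", "faq_a"],
]
gold = ["faq_a", "faq_a", "faq_a"]

print(top_k_accuracy(rankings, gold, k=1))   # 1 of 3 queries hit at rank 1
print(top_k_accuracy(rankings, gold, k=2))   # 2 of 3 queries hit within rank 2
print(mean_reciprocal_rank(rankings, gold))  # (1 + 1/2 + 1/3) / 3
```

Top-1 rewards only an exact first-place match, while MRR also gives partial credit when the correct FAQ appears lower in the list, which is why the two are usually reported together.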