🤖 AI Summary
In retrieval-augmented generation (RAG), hybrid retrieval that fuses dense and sparse (BM25) results with a fixed weight cannot adapt to individual queries. To address this, we propose a dynamic alpha tuning framework in which an LLM assesses the effectiveness of the top-1 result from each retriever, and the resulting scores determine a query-level fusion weight between dense and sparse retrieval. Departing from static weighting, the approach derives the weight by normalizing these effectiveness scores, which keeps the adaptation lightweight. Experiments demonstrate statistically significant improvements over fixed-weight baselines across multiple metrics, including Recall@K and Mean Reciprocal Rank (MRR), while maintaining high robustness and low inference overhead on medium- and small-scale models. The method enhances retrieval accuracy and end-to-end RAG performance without introducing substantial computational cost.
📝 Abstract
Hybrid retrieval techniques in Retrieval-Augmented Generation (RAG) systems enhance information retrieval by combining dense and sparse (e.g., BM25-based) retrieval methods. However, existing approaches adapt poorly, because a fixed weighting scheme cannot adjust to different queries. To address this, we propose DAT (Dynamic Alpha Tuning), a novel hybrid retrieval framework that dynamically balances dense retrieval and BM25 for each query. DAT uses a large language model (LLM) to evaluate the effectiveness of the top-1 result from each retrieval method, assigning an effectiveness score to each. It then derives the weighting factor by normalizing the two effectiveness scores, yielding an adaptive, query-aware balance between the two approaches. Empirical results show that DAT consistently and significantly outperforms fixed-weighting hybrid retrieval methods across various evaluation metrics. Even on smaller models, DAT delivers strong performance, highlighting its efficiency and adaptability.
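The weighting step described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration rather than the authors' implementation: the function names (`dynamic_alpha`, `min_max`, `fuse`) are hypothetical, the effectiveness scores are taken to be non-negative numbers already returned by an LLM judge, the weight is assumed to be the dense score's share of the total, the tie case falls back to 0.5, and retriever scores are min-max normalized before fusion.

```python
from typing import Dict, List, Tuple


def dynamic_alpha(dense_eff: float, bm25_eff: float) -> float:
    """Return the weight given to dense retrieval for one query.

    Assumption: the weight is the dense score's share of the total
    effectiveness, with an equal split when both scores are zero.
    """
    if dense_eff < 0 or bm25_eff < 0:
        raise ValueError("effectiveness scores must be non-negative")
    total = dense_eff + bm25_eff
    if total == 0:
        return 0.5  # neither top-1 result looked useful: weight both equally
    return dense_eff / total


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one retriever's scores to [0, 1] so the two are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(
    dense: Dict[str, float], bm25: Dict[str, float], alpha: float
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * dense + (1 - alpha) * bm25."""
    d, b = min_max(dense), min_max(bm25)
    fused = {
        doc: alpha * d.get(doc, 0.0) + (1.0 - alpha) * b.get(doc, 0.0)
        for doc in set(d) | set(b)
    }
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical scores: the LLM judged the dense top-1 result more useful.
    alpha = dynamic_alpha(dense_eff=4, bm25_eff=1)
    dense_scores = {"d1": 0.82, "d2": 0.75, "d3": 0.40}
    bm25_scores = {"d2": 12.0, "d4": 9.0, "d1": 3.0}
    for doc, score in fuse(dense_scores, bm25_scores, alpha):
        print(doc, round(score, 3))
```

In this sketch the only per-query cost beyond the two retrievals is the LLM call that scores the two top-1 results; the normalization and fusion are simple arithmetic.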