AI Summary
This study addresses two problems: official clinical guidelines in Hong Kong are fragmented and inconsistently formatted, which hinders public access, and general-purpose large language models lack localized medical knowledge and are prone to factual errors. To address them, the authors propose a Dual Retrieval-Augmented Generation (DRAG) framework tailored to Hong Kong's primary care setting. The framework combines query optimization, mixed-source retrieval, and context reorganization to synthesize information from multiple guideline sources and cite it precisely. Experiments show that the approach outperforms the baseline and ablated variants in answer accuracy and clarity, offering a reliable and traceable conversational system architecture for high-stakes, localized healthcare applications.
Abstract
To address the unsustainable rise in public health expenditure, the Hong Kong SAR Government is shifting its strategic focus to primary healthcare and encouraging citizens to use community resources to self-manage their health. However, official clinical guidelines are fragmented across disparate departments and formats, creating significant access barriers. General-purpose Large Language Models (LLMs) such as ChatGPT and DeepSeek could make this information more accessible, but they are prone to generating factually inaccurate content because they lack localized, domain-specific knowledge. To this end, we propose PriHA, a Retrieval-Augmented Generation (RAG)-enhanced LLM system that serves as a Primary Healthcare Assistant for Hong Kong. Specifically, we propose a tri-stage pipeline in which a query optimizer derives intent-oriented sub-queries from the user's question, followed by a novel Dual Retrieval-Augmented Generation (DRAG) architecture for mixed-source retrieval and context-reorganized generation. Comprehensive experiments and a detailed case study show that the proposed method outperforms both its ablated variants and the baseline in accuracy and clarity. Our research provides a reliable and traceable dialogue retrieval framework that can inform other high-risk, localized application scenarios.
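The tri-stage flow described in the abstract (query optimization, dual-source retrieval, context-reorganized generation) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the two in-memory corpora, the keyword-overlap scoring, and the placeholder passages are all assumptions made for illustration, and the LLM call is replaced by a function that only assembles a prompt.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str   # hypothetical source label, e.g. "guideline" or "faq"
    doc_id: str   # hypothetical identifier used for citation
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def optimize_query(question: str) -> list[str]:
    """Stage 1 (toy stand-in): split a compound question into sub-queries."""
    parts = re.split(r"\band\b|[?;]", question, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]


def retrieve(sub_query: str, corpus: list[Passage], k: int = 1) -> list[Passage]:
    """Rank passages by token overlap with the sub-query; drop zero-overlap ones."""
    q = tokenize(sub_query)
    scored = [(len(q & tokenize(p.text)), p) for p in corpus]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def dual_retrieve(sub_queries: list[str], corpus_a: list[Passage],
                  corpus_b: list[Passage], k: int = 1) -> list[Passage]:
    """Stage 2 (toy stand-in): query both sources per sub-query, deduplicating."""
    seen: set[str] = set()
    merged: list[Passage] = []
    for sq in sub_queries:
        for corpus in (corpus_a, corpus_b):
            for p in retrieve(sq, corpus, k):
                if p.doc_id not in seen:
                    seen.add(p.doc_id)
                    merged.append(p)
    return merged


def reorganize_context(passages: list[Passage]) -> str:
    """Stage 3a (toy stand-in): number passages so an answer can cite [n]."""
    return "\n".join(
        f"[{i}] ({p.source}:{p.doc_id}) {p.text}"
        for i, p in enumerate(passages, start=1)
    )


def build_prompt(question: str, context: str) -> str:
    """Stage 3b: assemble the prompt; a real system would send it to an LLM."""
    return (
        "Answer using only the numbered sources and cite them as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


# Placeholder passages only; they are not real clinical guidance.
GUIDELINES = [
    Passage("guideline", "g1", "placeholder text about blood pressure checks"),
    Passage("guideline", "g2", "placeholder text about diabetes screening"),
]
FAQ = [
    Passage("faq", "f1", "placeholder note on blood pressure clinics"),
    Passage("faq", "f2", "placeholder note on diabetes screening locations"),
]

question = ("How often should blood pressure be checked "
            "and where can I get diabetes screening?")
sub_queries = optimize_query(question)
hits = dual_retrieve(sub_queries, GUIDELINES, FAQ)
prompt = build_prompt(question, reorganize_context(hits))
print(prompt)
```

Splitting the question first lets each sub-query pull its own evidence from both sources, and numbering the merged passages is one simple way to make the final answer traceable to specific documents.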