🤖 AI Summary
Accurately matching patients to clinical trials remains challenging due to semantic heterogeneity and structural complexity in electronic health records (EHRs) and eligibility criteria.
Method: We propose a large language model (LLM)-driven retrieval-augmented generation (RAG) framework that jointly models multi-source EHR semantics and structured trial inclusion/exclusion criteria. Our approach integrates fine-tuned open-weight LLMs, structured prompt engineering, and optimized classification heads to deliver end-to-end, interpretable, and generalizable matching.
Contribution/Results: This work introduces the first RAG paradigm for clinical trial matching powered by open-weight LLMs, balancing logical traceability with cross-dataset generalization. Evaluated on four established benchmarks (n2c2, SIGIR, TREC 2021, TREC 2022), our method significantly outperforms TrialGPT, zero-shot baselines, and the closed-source GPT-4, demonstrating both state-of-the-art performance and practical viability.
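The retrieval step of the framework above can be illustrated with a toy sketch: rank EHR note chunks by similarity to the trial criteria and keep the top-k as patient context. Bag-of-words vectors stand in for real embeddings here, and all names (`embed`, `retrieve`, the sample notes) are illustrative assumptions, not the authors' implementation.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural encoder.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Return the k chunks most similar to the query (e.g. trial criteria).
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

notes = [
    "history of type 2 diabetes, metformin 1000 mg daily",
    "left knee arthroscopy in 2015, no complications",
    "HbA1c 8.1 percent on last visit",
]
context = retrieve("type 2 diabetes HbA1c", notes, k=2)
```

In the actual framework, the retrieved chunks would come from a much larger EHR pool and feed into the structured prompt rather than being used directly.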
📝 Abstract
Patient matching is the process of linking patients to appropriate clinical trials by accurately identifying and matching their medical records with trial eligibility criteria. We propose LLM-Match, a novel framework for patient matching leveraging fine-tuned open-source large language models. Our approach consists of four key components. First, a retrieval-augmented generation (RAG) module extracts relevant patient context from a vast pool of electronic health records (EHRs). Second, a prompt generation module constructs input prompts by integrating trial eligibility criteria (both inclusion and exclusion criteria), patient context, and system instructions. Third, a fine-tuning module with a classification head optimizes the model parameters using structured prompts and ground-truth labels. Fourth, an evaluation module assesses the fine-tuned model's performance on the testing datasets. We evaluated LLM-Match on four open datasets (n2c2, SIGIR, TREC 2021, and TREC 2022) using open-source models, comparing it against TrialGPT, zero-shot prompting, and closed-source GPT-4-based baselines. LLM-Match outperformed all baselines.
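The prompt-generation module described above can be sketched as a function that assembles system instructions, retrieved patient context, and the trial's inclusion/exclusion criteria into one structured prompt. The function name, instruction wording, and field layout here are illustrative assumptions, not the authors' actual template.

```python
# Hypothetical sketch of the prompt-generation module: combine system
# instructions, retrieved EHR context, and trial criteria into one prompt.
SYSTEM_INSTRUCTIONS = (
    "You are a clinical trial matching assistant. Decide whether the "
    "patient meets the trial's eligibility criteria."
)

def build_prompt(patient_context: str,
                 inclusion: list[str],
                 exclusion: list[str]) -> str:
    """Assemble a structured prompt from patient context and criteria."""
    inc = "\n".join(f"- {c}" for c in inclusion)
    exc = "\n".join(f"- {c}" for c in exclusion)
    return (
        f"{SYSTEM_INSTRUCTIONS}\n\n"
        f"Patient context (retrieved from EHR):\n{patient_context}\n\n"
        f"Inclusion criteria:\n{inc}\n\n"
        f"Exclusion criteria:\n{exc}\n\n"
        "Answer: eligible or ineligible?"
    )

prompt = build_prompt(
    "62-year-old with type 2 diabetes, HbA1c 8.1%.",
    ["Adults aged 18-75", "Diagnosed type 2 diabetes"],
    ["Pregnancy", "End-stage renal disease"],
)
```

In the full pipeline, prompts like this one, paired with ground-truth eligibility labels, would be the training inputs for the fine-tuning module with its classification head.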