🤖 AI Summary
Patient recruitment for clinical trials is hindered by complex eligibility criteria and inefficient manual review of electronic health records (EHRs). Existing text-only models suffer from weak reasoning capabilities, information loss during medical image-to-text conversion, and heavy dependence on deep EHR system integration. To address these challenges, we propose a plug-and-play multimodal matching framework that combines a reasoning-enhanced large language model with visual understanding, so that unstructured clinical text and medical images extracted from EHRs can be parsed jointly. We further design a multimodal embedding retrieval mechanism for medical records that works on unprocessed documents. On the n2c2 2018 cohort selection dataset, our approach achieves 93% criterion-level accuracy; on a real-world dataset, it achieves 87% accuracy, and users reviewed overall eligibility in under 9 minutes per patient on average, an 80% improvement over manual chart review. Our key contribution is an EHR-agnostic multimodal patient–trial matching pipeline that requires no system customization or interface modification.
📝 Abstract
Background: Patient recruitment in clinical trials is hindered by complex eligibility criteria and labor-intensive chart reviews. Prior research using text-only models has struggled to address this problem reliably and at scale because of (1) limited reasoning capabilities, (2) information loss from converting visual records to text, and (3) the lack of a generic EHR integration for extracting patient data. Methods: We introduce a broadly applicable, integration-free, LLM-powered pipeline that automates patient-trial matching using unprocessed documents extracted from EHRs. Our approach leverages (1) the new reasoning-LLM paradigm, enabling the assessment of even the most complex criteria, (2) the visual capabilities of the latest LLMs to interpret medical records without lossy image-to-text conversion, and (3) multimodal embeddings for efficient medical record search. The pipeline was validated on the n2c2 2018 cohort selection dataset (288 diabetic patients) and on a real-world dataset of 485 patients from 30 different sites matched against 36 diverse trials. Results: On the n2c2 dataset, our method achieved a new state-of-the-art criterion-level accuracy of 93%. In real-world trials, the pipeline reached an accuracy of 87%, limited by the difficulty of replicating human decision-making when medical records lack sufficient information. Nevertheless, users were able to review overall eligibility in under 9 minutes per patient on average, representing an 80% improvement over traditional manual chart reviews. Conclusion: This pipeline demonstrates robust performance in clinical trial patient matching without requiring custom integration with site systems or trial-specific tailoring, thereby enabling scalable deployment across sites seeking to leverage AI for patient matching.
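The abstract does not specify how the multimodal embedding search is implemented, so the following is only a minimal sketch of the general idea: each record page (text or image) is assumed to already have an embedding vector, and the pages most similar to an eligibility criterion are retrieved by cosine similarity. The function names, page identifiers, and toy three-dimensional vectors are all hypothetical and stand in for the output of a real embedding model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_pages(
    criterion_vec: Sequence[float],
    page_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k record pages most similar to the criterion embedding."""
    scored = [(page_id, cosine(criterion_vec, vec)) for page_id, vec in page_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-made embeddings; a real system would obtain these
# from a multimodal embedding model applied to each page of the record.
pages = {
    "lab_report_p1": [0.9, 0.1, 0.0],
    "discharge_note_p3": [0.2, 0.8, 0.1],
    "ecg_scan_p1": [0.0, 0.1, 0.9],
}
criterion = [0.8, 0.2, 0.0]  # hypothetical embedding of one eligibility criterion

hits = top_k_pages(criterion, pages, k=2)
for page_id, score in hits:
    print(f"{page_id}: {score:.3f}")
```

In a pipeline of the kind described, the retrieved pages would then be passed to the reasoning LLM for the actual criterion assessment; that step is not shown here because the abstract gives no detail on prompts or models.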