🤖 AI Summary
Low patient recruitment efficiency in clinical trials hinders study feasibility and generalizability.
Method: We propose a novel eligibility matching paradigm that takes patients' natural-language descriptions as input, introducing NLI4PR, the first patient-facing natural language inference task for semantically aligning colloquial health statements with structured eligibility criteria. Using manually rewritten patient-language profiles derived from the TREC 2022 Clinical Trial Track data, we run zero-shot and few-shot inference experiments across multiple open-source large language models.
Contribution/Results: The best-performing model achieves 71.8 F1 on patient language, only 1.3 points below its performance on expert medical language, giving the first empirical evidence that preliminary eligibility screening can start from a patient's own description with only a small loss in accuracy. This work shifts recruitment from a clinician-initiated toward a patient-initiated process, and we publicly release all data and code to support decentralized, patient-inclusive trial enrollment.
📝 Abstract
Recruiting patients to participate in clinical trials can be challenging and time-consuming. Usually, participation in a clinical trial is initiated by a healthcare professional and proposed to the patient. Promoting clinical trials directly to patients via online recruitment might help to reach them more efficiently. In this study, we address the case where a patient initiates their own recruitment process and wants to determine whether they are eligible for a given clinical trial, using their own language to describe their medical profile. To study whether this creates difficulties in the patient-trial matching process, we design a new dataset and task, Natural Language Inference for Patient Recruitment (NLI4PR), in which patient-language profiles must be matched to clinical trials. We create it by adapting the TREC 2022 Clinical Trial Track dataset, which provides patients' medical profiles, and rephrasing them manually in patient language. We also use the associated clinical trial reports in which the patients are either eligible or excluded. We prompt several open-source Large Language Models on our task and achieve F1 scores ranging from 56.5 to 71.8 using patient language, against 64.7 to 73.1 for the same task using medical language. With patient language, we observe only a small loss in performance for the best model, suggesting that having the patient as a starting point could be adopted to help recruit patients for clinical trials. The corpus and code are freely available on our GitHub and HuggingFace repositories.
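The evaluation setup described above can be sketched as follows. This is an illustrative mock-up, not the authors' released code: the prompt wording, function names, and toy labels are assumptions. An LLM receives a patient-language profile and a trial's eligibility criteria, answers "eligible" or "excluded", and the predictions are scored with binary F1.

```python
def build_prompt(patient_profile: str, trial_criteria: str) -> str:
    """Compose a zero-shot NLI-style prompt (illustrative wording, not the paper's exact template)."""
    return (
        "Patient description (in the patient's own words):\n"
        f"{patient_profile}\n\n"
        "Clinical trial eligibility criteria:\n"
        f"{trial_criteria}\n\n"
        "Question: Is this patient eligible for the trial? "
        "Answer with exactly one word: eligible or excluded."
    )

def f1(preds, golds, positive="eligible"):
    """Binary F1 over eligible/excluded labels, computed from scratch."""
    tp = sum(p == positive and g == positive for p, g in zip(preds, golds))
    fp = sum(p == positive and g != positive for p, g in zip(preds, golds))
    fn = sum(p != positive and g == positive for p, g in zip(preds, golds))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy illustration with mock model outputs (no real LLM call):
golds = ["eligible", "excluded", "eligible", "excluded"]
preds = ["eligible", "excluded", "excluded", "excluded"]
print(round(f1(preds, golds), 3))  # 0.667
```

In the paper's actual pipeline, `preds` would come from parsing the model's generated answer for each patient-trial pair; the sketch only fixes the scoring convention.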