AI Summary
To address the high cost and limited scalability of clinical laboratory testing, particularly when screening for hundreds to thousands of diseases, this paper proposes CLDD (Collaborative Learning for Disease Detection), a graph-based deep learning model. CLDD reformulates disease detection as an adaptive collaborative learning task that jointly models disease-disease associations and patient-patient similarities, so that it needs little to no input from disease-specific diagnostic tests. The model integrates heterogeneous features from electronic health records (EHRs), including patient-disease interactions and demographic attributes, and employs a graph neural network for collaborative representation learning. It also ranks candidate diseases for each patient in an interpretable way to support clinical decision-making. Evaluated on a processed version of the MIMIC-IV dataset (61,191 patients, 2,000 diseases), CLDD improves recall by 6.33% and precision by 7.63% over representative baselines. Case studies on individual patients further show that it recovers masked diseases within its top-ranked predictions.
Abstract
Accurate disease detection is of paramount importance for effective medical treatment and patient care. However, the process of disease detection is often associated with extensive medical testing and considerable costs, making it impractical to perform all possible medical tests on a patient to diagnose or predict hundreds or thousands of diseases. In this work, we propose Collaborative Learning for Disease Detection (CLDD), a novel graph-based deep learning model that formulates disease detection as a collaborative learning task by exploiting associations among diseases and similarities among patients adaptively. CLDD integrates patient-disease interactions and demographic features from electronic health records to detect hundreds or thousands of diseases for every patient, with little to no reliance on the corresponding medical tests. Extensive experiments on a processed version of the MIMIC-IV dataset comprising 61,191 patients and 2,000 diseases demonstrate that CLDD consistently outperforms representative baselines across multiple metrics, achieving a 6.33% improvement in recall and 7.63% improvement in precision. Furthermore, case studies on individual patients illustrate that CLDD can successfully recover masked diseases within its top-ranked predictions, demonstrating both interpretability and reliability in disease prediction. By reducing diagnostic costs and improving accessibility, CLDD holds promise for large-scale disease screening and social health security.
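The collaborative idea described above can be illustrated with a small, training-free sketch. This is not the CLDD model: it swaps the learned graph neural network for a single step of normalised neighbourhood propagation over a toy patient-disease matrix, and it ignores demographic features. All function names and the toy data are hypothetical and chosen only for illustration. The sketch shows how patient-patient similarity and disease-disease co-occurrence can rank a masked disease for a patient, and how recall and precision at k would be computed for that ranking.

```python
import numpy as np


def normalize_bipartite(interactions):
    """Symmetrically normalise a binary patient x disease matrix."""
    r = np.asarray(interactions, dtype=float)
    deg_p = r.sum(axis=1)  # number of recorded diseases per patient
    deg_d = r.sum(axis=0)  # number of patients per disease
    inv_p = np.zeros_like(deg_p)
    inv_d = np.zeros_like(deg_d)
    inv_p[deg_p > 0] = deg_p[deg_p > 0] ** -0.5
    inv_d[deg_d > 0] = deg_d[deg_d > 0] ** -0.5
    return inv_p[:, None] * r * inv_d[None, :]


def collaborative_scores(interactions):
    """Score every (patient, disease) pair by one propagation step.

    (R R^T) R weights each patient's neighbours by shared diseases; by
    associativity it equals R (R^T R), i.e. disease co-occurrence.
    """
    r_hat = normalize_bipartite(interactions)
    patient_sim = r_hat @ r_hat.T
    return patient_sim @ r_hat


def top_k_unseen(scores_row, observed_row, k):
    """Rank diseases not already recorded for the patient."""
    masked = np.where(np.asarray(observed_row) > 0, -np.inf, scores_row)
    return np.argsort(-masked, kind="stable")[:k].tolist()


def recall_precision_at_k(ranked, held_out, k):
    """Recall@k and precision@k against the held-out (masked) diseases."""
    hits = len(set(ranked[:k]) & set(held_out))
    return hits / len(held_out), hits / k


# Toy data (assumed): 5 patients x 5 diseases. Patient 0 truly has
# disease 2 as well, but it is masked out of the observed matrix.
observed = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
])
scores = collaborative_scores(observed)
ranked = top_k_unseen(scores[0], observed[0], k=2)
recall, precision = recall_precision_at_k(ranked, held_out=[2], k=2)
print(ranked, recall, precision)
```

In this toy setting, patient 0 overlaps only with patients 1 and 2, so the masked disease 2 receives a positive score while diseases 3 and 4 score zero. A learned model such as the one the abstract describes would replace the fixed propagation with trained representations, but the masking and ranking protocol would look similar.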