High-Throughput Phenotyping of Clinical Text Using Large Language Models

📅 2024-08-02
🏛️ 2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study addresses the challenge of automatically mapping patient signs described in clinical text to standardized concepts in the Human Phenotype Ontology (HPO). It evaluates the end-to-end performance of GPT-4 and GPT-3.5-Turbo on clinical summaries from the OMIM database, covering sign identification, categorization, and normalization, with APIs automating text retrieval and the model calls. GPT-4 outperformed GPT-3.5-Turbo, and its concordance with manual annotators was comparable to the concordance between manual annotators. Normalization remains the weak point, particularly retrieving the correct HPO ID for a normalized term; the authors suggest retrieval-augmented generation (RAG), changes in pre-training, and additional fine-tuning as possible remedies. The core contribution is empirical evidence that combining APIs with large language models is a promising approach to high-throughput phenotyping of free text in support of precision medicine.

📝 Abstract
High-throughput phenotyping automates the mapping of patient signs to standardized concepts, such as those in the Human Phenotype Ontology (HPO), a process critical to precision medicine. We evaluated the automated phenotyping of clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database using a large language model. Various APIs were used to automate text retrieval, sign identification, categorization, and normalization. GPT-4 outperformed GPT-3.5-Turbo in identifying, categorizing, and normalizing signs, achieving concordance with manual annotators comparable to concordance between manual annotators. While GPT-4 demonstrates high accuracy in sign identification and categorization, limitations remain in sign normalization, particularly in retrieving the correct HPO ID for a normalized term. Methods such as retrieval-augmented generation, changes in pre-training, and additional fine-tuning may help address these limitations. The combination of APIs with large language models presents a promising approach for high-throughput phenotyping of free text.
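The normalization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the three-entry `HPO_LOOKUP` table, and the keyword-based `keyword_extractor` (a stand-in for an LLM API call) are all assumptions made for illustration. The idea shown is one retrieval-style mitigation of the HPO ID problem: let the model propose a term, then look the ID up in a local copy of the ontology instead of trusting a model-generated ID.

```python
from typing import Callable, Dict, List, Optional

# Tiny illustrative slice of HPO (lower-cased term label -> HPO ID).
# Assumption: a real pipeline would load the full ontology instead.
HPO_LOOKUP: Dict[str, str] = {
    "seizure": "HP:0001250",
    "ataxia": "HP:0001251",
    "microcephaly": "HP:0000252",
}


def normalize_term(term: str) -> Optional[str]:
    """Return the HPO ID for a term label, or None if it is not in the table."""
    return HPO_LOOKUP.get(term.strip().lower())


def phenotype(text: str, extract_signs: Callable[[str], List[str]]) -> List[dict]:
    """Identify signs with the supplied extractor, then attach looked-up HPO IDs."""
    return [
        {"sign": sign, "hpo_id": normalize_term(sign)}
        for sign in extract_signs(text)
    ]


def keyword_extractor(text: str) -> List[str]:
    """Hypothetical stand-in for an LLM call: plain substring matching."""
    vocabulary = ["Seizure", "Ataxia", "Microcephaly", "Nystagmus"]
    lowered = text.lower()
    return [term for term in vocabulary if term.lower() in lowered]


if __name__ == "__main__":
    summary = "Affected individuals show ataxia and nystagmus."
    for row in phenotype(summary, keyword_extractor):
        print(row)
```

Terms missing from the table come back with `hpo_id` set to `None`, which makes normalization failures explicit rather than silently returning a wrong ID. Swapping `keyword_extractor` for a function that calls a language model leaves the rest of the sketch unchanged.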
Problem

Research questions and friction points this paper is trying to address.

Automating clinical text phenotyping using large language models
Comparing GPT-4 and GPT-3.5-Turbo for phenotype identification
Enhancing precision medicine through standardized ontology mapping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses GPT-4 for clinical text phenotyping
Compares GPT-4 and GPT-3.5-Turbo performance
Eliminates need for manual training data
Authors

D. B. Hier
University of Illinois at Chicago, Chicago IL 60612
S. I. Munzir
University of Illinois at Chicago, Chicago IL 60612
Anne Stahlfeld
University of Illinois at Chicago, Chicago IL 60612
Tayo Obafemi-Ajayi
Missouri State University
Machine learning, data mining, bioinformatics, intelligent systems
M. Carrithers
University of Illinois at Chicago, Chicago IL 60612