High-Throughput Phenotyping of Clinical Text Using Large Language Models

📅 2024-08-02

🏛️ 2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of automatically mapping phenotypic signs described in clinical text to standardized Human Phenotype Ontology (HPO) concepts. It evaluates the end-to-end performance of GPT-4 and GPT-3.5-Turbo on clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database across three steps: sign identification, categorization, and normalization to HPO terms. APIs automate text retrieval and each LLM step. GPT-4 outperformed GPT-3.5-Turbo, and its concordance with manual annotators on identification and categorization was comparable to the concordance between manual annotators. Normalization remains the weak point, particularly retrieving the correct HPO ID for a normalized term; the authors suggest retrieval-augmented generation (RAG), changes in pre-training, and additional fine-tuning as possible remedies. The core contribution is empirical evidence that LLM-driven phenotype standardization is feasible, together with an API-coordinated pipeline for high-throughput phenotyping of free text in support of precision medicine.
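The comparison against inter-annotator agreement can be illustrated with a small sketch. It assumes concordance is measured as the Jaccard overlap between sets of identified terms; the summary does not say which metric the paper uses, so the metric, the `jaccard` helper, and the annotations below are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two term sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical annotations for a single clinical summary (not data from the paper).
annotator_1 = {"ataxia", "seizure", "microcephaly"}
annotator_2 = {"ataxia", "seizure"}
model = {"ataxia", "seizure", "microcephaly", "tremor"}

# Agreement between the two human annotators.
human_human = jaccard(annotator_1, annotator_2)

# Mean agreement between the model and each human annotator.
model_human = (jaccard(model, annotator_1) + jaccard(model, annotator_2)) / 2

print(f"human vs human: {human_human:.3f}")
print(f"model vs human: {model_human:.3f}")
```

A model would be said to reach annotator-level concordance when `model_human` is close to `human_human` across many summaries, not on a single toy case like this one.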

📝 Abstract

High-throughput phenotyping automates the mapping of patient signs to standardized concepts, such as those in the Human Phenotype Ontology (HPO), a process critical to precision medicine. We evaluated the automated phenotyping of clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database using a large language model. Various APIs were used to automate text retrieval, sign identification, categorization, and normalization. GPT-4 outperformed GPT-3.5-Turbo in identifying, categorizing, and normalizing signs, achieving concordance with manual annotators comparable to the concordance between manual annotators. While GPT-4 demonstrates high accuracy in sign identification and categorization, limitations remain in sign normalization, particularly in retrieving the correct HPO ID for a normalized term. Methods such as retrieval-augmented generation, changes in pre-training, and additional fine-tuning may help address these limitations. The combination of APIs with large language models presents a promising approach for high-throughput phenotyping of free text.
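The steps named in the abstract (identification, categorization, normalization) might be wired together roughly as follows. This is a minimal sketch, not the authors' implementation: the `Phenotype` record, the `phenotype_text` function, the `fake_extract` stand-in for an LLM call, and the category labels are all hypothetical. The HPO ID is taken from a lookup table rather than from the model, which is one simple retrieval-style way to approach the ID limitation the abstract describes. The three-entry table is illustrative only, and a real run would load a full HPO release.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Tiny illustrative subset of HPO labels mapped to IDs (assumption: a real
# pipeline would load these from the full, current ontology release).
HPO_LABEL_TO_ID = {
    "seizure": "HP:0001250",
    "ataxia": "HP:0001251",
    "microcephaly": "HP:0000252",
}


@dataclass
class Phenotype:
    mention: str              # text span as written in the clinical summary
    category: str             # coarse category assigned to the sign
    normalized_term: str      # lower-cased label proposed for the sign
    hpo_id: Optional[str]     # None when the label is not in the lookup table


def phenotype_text(text: str, extract: Callable[[str], list[dict]]) -> list[Phenotype]:
    """Run extraction, then resolve each normalized term to an HPO ID by lookup."""
    results = []
    for item in extract(text):
        term = item["normalized_term"].strip().lower()
        results.append(
            Phenotype(item["mention"], item["category"], term, HPO_LABEL_TO_ID.get(term))
        )
    return results


def fake_extract(text: str) -> list[dict]:
    """Hypothetical stand-in for an LLM call that returns structured signs."""
    candidates = [
        {"mention": "seizures", "category": "neurological", "normalized_term": "Seizure"},
        {"mention": "small head", "category": "head and neck", "normalized_term": "Microcephaly"},
        {"mention": "unsteady gait", "category": "neurological", "normalized_term": "Gait instability"},
    ]
    return [c for c in candidates if c["mention"] in text.lower()]


summary = "Affected children had seizures, a small head, and an unsteady gait."
phenotypes = phenotype_text(summary, fake_extract)
for p in phenotypes:
    print(p)
```

Passing the extractor in as a callable keeps the sketch runnable without network access and lets a real model client be swapped in later. Terms missing from the table come back with `hpo_id=None` so they can be flagged for review instead of receiving a guessed ID.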

Problem

Research questions and friction points this paper is trying to address.

Automating clinical text phenotyping using large language models

Comparing GPT-4 and GPT-3.5-Turbo for phenotype identification

Enhancing precision medicine through standardized ontology mapping

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses GPT-4 for clinical text phenotyping

Compares GPT-4 and GPT-3.5-Turbo performance

Applies pre-trained models without task-specific fine-tuning
