🤖 AI Summary
No systematic review of large language models (LLMs) in disease diagnosis yet exists. To address this gap, we conduct a bibliometric and thematic analysis that integrates natural language processing, clinical informatics, and evidence-based medicine evaluation frameworks. Our study establishes, for the first time, a multidimensional analytical framework encompassing disease categories, clinical data modalities, model architectures, and evaluation methodologies, covering 52 diseases across 12 clinical specialties. We identify adaptation patterns and evaluation biases of mainstream LLMs (e.g., GPT, LLaMA), and uncover two critical bottlenecks: insufficient reproducibility and a lack of rigorous clinical validation. Based on these findings, we propose standardized evaluation criteria and a phased clinical implementation pathway. This work fills a significant gap in high-quality, integrative reviews of LLMs in diagnostic applications, offering actionable insights for researchers and clinicians alike.
📝 Abstract
Automatic disease diagnosis has become increasingly valuable in clinical practice. The advent of large language models (LLMs) has catalyzed a paradigm shift in artificial intelligence, with growing evidence supporting the efficacy of LLMs in diagnostic tasks. Despite increasing attention to this field, a holistic view is still lacking. Many critical aspects remain unclear, such as the diseases and clinical data to which LLMs have been applied, the LLM techniques employed, and the evaluation methods used. In this article, we present a comprehensive review of LLM-based methods for disease diagnosis. Our review examines the existing literature along several dimensions: disease types and associated clinical specialties, clinical data, LLM techniques, and evaluation methods. We also offer recommendations for applying and evaluating LLMs in diagnostic tasks, assess the limitations of current research, and discuss future directions. To our knowledge, this is the first comprehensive review of LLM-based disease diagnosis.