LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis

📅 2025-06-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Sensitive medical data often cannot be uploaded to cloud platforms, which impedes the clinical deployment of large language models (LLMs) for differential diagnosis. To address this, the paper proposes a privacy-preserving, interpretable, on-device medical document analysis framework. Built on LLaMA-v3, the method uses Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning and integrates explainability techniques. Evaluated on the DDXPlus dataset, it supports variable-length diagnostic outputs and jointly performs pathology prediction and multi-candidate differential diagnosis generation. The authors report higher accuracy than existing state-of-the-art models on both tasks. The codebase is open-sourced and accompanied by a web interface through which users can submit their own unstructured medical documents and receive explainable diagnostic results.

📝 Abstract

Medical document analysis plays a crucial role in extracting essential clinical insights from unstructured healthcare records, supporting critical tasks such as differential diagnosis. Determining the most probable condition among overlapping symptoms requires precise evaluation and deep medical expertise. While recent advancements in large language models (LLMs) have significantly enhanced performance in medical document analysis, privacy concerns related to sensitive patient data limit the use of online LLM services in clinical settings. To address these challenges, we propose a trustworthy medical document analysis platform that fine-tunes a LLaMA-v3 model using low-rank adaptation, specifically optimized for differential diagnosis tasks. Our approach utilizes DDXPlus, the largest benchmark dataset for differential diagnosis, and demonstrates superior performance in pathology prediction and variable-length differential diagnosis compared to existing methods. The developed web-based platform allows users to submit their own unstructured medical documents and receive accurate, explainable diagnostic results. By incorporating advanced explainability techniques, the system ensures transparent and reliable predictions, fostering user trust and confidence. Extensive evaluations confirm that the proposed method surpasses current state-of-the-art models in predictive accuracy while offering practical utility in clinical settings. This work addresses the urgent need for reliable, explainable, and privacy-preserving artificial intelligence solutions, representing a significant advancement in intelligent medical document analysis for real-world healthcare applications. The code can be found at https://github.com/leitro/Differential-Diagnosis-LoRA.

Problem

Research questions and friction points this paper is trying to address.

Enhancing trustworthy differential diagnosis using LLMs

Addressing privacy concerns in medical document analysis

Improving accuracy and explainability in pathology prediction
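Because the differential diagnosis described in the abstract is variable-length, predictive accuracy for it cannot be reduced to a single top-1 match. The sketch below shows one common way to score a predicted set of conditions against a reference set, using set-based precision, recall, and F1. This is an assumption for illustration only: the function name `differential_scores` and the example condition names are hypothetical, and the paper's actual evaluation protocol may differ.

```python
def differential_scores(predicted, reference):
    """Score a predicted differential against a reference differential.

    Both arguments are iterables of condition names; order and
    duplicates are ignored. Returns (precision, recall, f1).
    """
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the model lists three conditions, the reference lists two.
predicted = ["Pneumonia", "Bronchitis", "URTI"]
reference = ["Pneumonia", "Bronchitis"]
precision, recall, f1 = differential_scores(predicted, reference)
```

Here the extra predicted condition lowers precision while recall stays perfect, which is the kind of trade-off a variable-length output has to balance.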

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes LLaMA-v3 using low-rank adaptation

Utilizes DDXPlus benchmark dataset for diagnosis

Web platform for explainable diagnostic results
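To make the low-rank adaptation idea concrete, here is a minimal pure-Python sketch of the general LoRA update, in which a frozen weight matrix W is augmented by a trainable low-rank product scaled by alpha / r. It is not taken from the paper's repository: the helper names, the toy matrices, and the rank and dimensions used in the parameter count are all illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_weight(w, a, b, alpha, r):
    """Return the effective weight W + (alpha / r) * (B @ A).

    w: frozen weight, d_out x d_in
    a: trainable matrix A, r x d_in
    b: trainable matrix B, d_out x r
    """
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Count LoRA parameters for one weight matrix: B plus A."""
    return r * (d_out + d_in)


# Toy example: frozen 2x3 weight with a rank-1 adapter.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # r x d_in
B = [[0.5], [0.0]]      # d_out x r (nonzero here only to show an effect)
W_eff = lora_weight(W, A, B, alpha=2.0, r=1)

# Assumed sizes for illustration: a 4096 x 4096 projection with rank 8.
lora_count = trainable_params(4096, 4096, 8)
full_count = 4096 * 4096
```

With these assumed sizes the adapter trains 65,536 parameters for a matrix that holds 16,777,216, which is why this style of fine-tuning is described as parameter-efficient.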
