Analysis of Blood Report Images Using General Purpose Vision-Language Models

📅 2025-09-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Public misinterpretation of blood test reports frequently triggers health anxiety and suboptimal clinical decision-making. Method: This study presents the first systematic evaluation of general-purpose vision-language models (VLMs), including Qwen-VL-Max, Gemini 2.5 Pro, and Llama 4 Maverick, on blood report image understanding. We constructed a standardized dataset of blood report image–question pairs and employed Sentence-BERT–based semantic similarity scoring for automated, objective assessment. Contribution/Results: Results demonstrate that state-of-the-art VLMs produce clinically relevant answers, particularly excelling in the clarity and readability of result explanations. Several models achieve practical utility for patient-facing interpretation of critical laboratory parameters. This work empirically validates the feasibility of deploying general-purpose VLMs in lightweight, interpretable medical explanation tasks, establishing both an evidence base and a methodological framework for developing low-cost, scalable AI tools that strengthen public health literacy.

📝 Abstract
The reliable analysis of blood reports is important for health knowledge, but individuals often struggle with interpretation, leading to anxiety and overlooked issues. We explore the potential of general-purpose Vision-Language Models (VLMs) to address this challenge by automatically analyzing blood report images. We conduct a comparative evaluation of three VLMs, Qwen-VL-Max, Gemini 2.5 Pro, and Llama 4 Maverick, assessing their performance on a dataset of 100 diverse blood report images. Each model was prompted with clinically relevant questions adapted to each report, and the answers were then scored with Sentence-BERT semantic similarity to quantify how closely the models' responses agreed. The findings suggest that general-purpose VLMs are a practical and promising technology for building patient-facing tools for preliminary blood report analysis. Their ability to provide clear interpretations directly from images can improve health literacy and lower barriers to understanding complex medical information. This work establishes a foundation for the future development of reliable and accessible AI-assisted healthcare applications. While results are encouraging, they should be interpreted cautiously given the limited dataset size.
Problem

Research questions and friction points this paper is trying to address.

Automating blood report image interpretation for patients
Evaluating general-purpose VLMs on clinical question answering
Reducing health literacy barriers through AI analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

General-purpose VLMs analyze blood report images
Comparative evaluation of three VLMs using clinical questions
Sentence-BERT processes answers for automated medical interpretation
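The paper does not publish its scoring code; a minimal sketch of the Sentence-BERT-style agreement scoring described above, assuming answer embeddings have already been produced (the model name and function names below are illustrative, not taken from the paper):

```python
import numpy as np

# In the paper's setup, each VLM answer would first be embedded with a
# Sentence-BERT model, e.g. via the sentence-transformers library:
#   model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
#   embeddings = model.encode(answers)
# The scoring itself then reduces to cosine similarity over those vectors.

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairwise_agreement(embeddings: list[np.ndarray]) -> float:
    """Mean pairwise cosine similarity across the models' answer embeddings."""
    n = len(embeddings)
    scores = [cosine_similarity(embeddings[i], embeddings[j])
              for i in range(n) for j in range(i + 1, n)]
    return sum(scores) / len(scores)
```

A high mean pairwise score indicates the three models converge on semantically similar interpretations of a report; low scores flag questions where the models disagree.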
Nadia Bakhsheshi
dept. Electrical Engineering, Sharif University of Technology, Tehran, Iran
Hamid Beigy
Sharif University of Technology
Deep Learning · Machine Learning · Data Stream Mining · Information Retrieval · Natural Language Processing