🤖 AI Summary
This study investigates whether large language models (LLMs) can challenge the dominance of convolutional neural networks (CNNs) in medical image classification. Method: We construct a multimodal AI framework using public medical imaging datasets and systematically compare CNNs against multiple LLMs enhanced with image feature filtering, evaluating diagnostic accuracy (accuracy, F1-score), inference efficiency, and carbon footprint (energy consumption and CO₂ emissions). Contribution/Results: We present the first empirical evidence that image feature filtering significantly boosts LLM diagnostic performance, enabling accuracy comparable to CNNs. Crucially, LLM-based pipelines reduce inference energy consumption and associated CO₂ emissions by over 60% on average, achieving superior energy efficiency and scalability for clinical deployment. These findings establish a novel, holistic evaluation paradigm for medical AI that jointly optimizes diagnostic accuracy, computational efficiency, and environmental sustainability.
📝 Abstract
This study presents a multimodal AI framework designed for the precise classification of medical diagnostic images. Using publicly available datasets, we compare the strengths of convolutional neural networks (CNNs) and several large language models (LLMs). This in-depth comparative analysis highlights key differences in diagnostic performance, execution efficiency, and environmental impact. Models were evaluated on accuracy, F1-score, average execution time, average energy consumption, and estimated CO₂ emissions. The findings indicate that although CNN-based models can outperform various multimodal techniques that incorporate both images and contextual information, applying additional filtering on top of LLMs can lead to substantial performance gains. These findings highlight the transformative potential of multimodal AI systems to enhance the reliability, efficiency, and scalability of medical diagnostics in clinical settings.
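The evaluation metrics named in the abstract can be sketched in a few lines. This is a minimal illustration, not the study's code: the binary F1 definition and the grid-intensity factor of 475 gCO₂/kWh used to convert energy into emissions are illustrative assumptions, not values taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def co2_grams(energy_kwh, grid_intensity_g_per_kwh=475.0):
    """Estimated CO2 emissions from measured inference energy.
    The 475 g/kWh grid intensity is a hypothetical placeholder, not a value
    from the study; real estimates depend on the deployment region's grid."""
    return energy_kwh * grid_intensity_g_per_kwh
```

Usage on toy labels: `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` returns `0.75`, and `co2_grams(1.2)` converts 1.2 kWh of measured inference energy into an emissions estimate under the assumed grid intensity.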