A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models

📅 2025-07-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study identifies a concerning trend of significantly weakened safety disclaimers in generative AI—specifically large language models (LLMs) and vision-language models (VLMs)—during medical image interpretation and clinical question answering. From 2022 to 2025, disclaimer rates plummeted from 26.3% to 0.97% in LLM outputs and from 19.6% to 1.05% in VLM outputs. Method: Using a curated multimodal dataset of 500 medical images (mammography, chest X-ray, dermoscopy) and 500 clinically grounded questions, we employed automated keyword screening and content analysis to systematically quantify disclaimer prevalence for the first time. Contribution/Results: We demonstrate that increasing model authority correlates with declining transparency and accountability, posing tangible clinical risks. To address this, we propose a novel “clinical-context-aware dynamic disclaimer embedding” framework that adaptively integrates context-sensitive disclaimers into model outputs. Our findings provide empirical evidence and methodological guidance for regulatory policy development and safety-conscious AI design in healthcare.

Technology Category

Application Category

📝 Abstract
Generative AI models, including large language models (LLMs) and vision-language models (VLMs), are increasingly used to interpret medical images and answer clinical questions. Their responses often include inaccuracies; therefore, safety measures like medical disclaimers are critical to remind users that AI outputs are not professionally vetted or a substitute for medical advice. This study evaluated the presence of disclaimers in LLM and VLM outputs across model generations from 2022 to 2025. Using 500 mammograms, 500 chest X-rays, 500 dermatology images, and 500 medical questions, outputs were screened for disclaimer phrases. Medical disclaimer presence in LLM and VLM outputs dropped from 26.3% in 2022 to 0.97% in 2025, and from 19.6% in 2023 to 1.05% in 2025, respectively. By 2025, the majority of models displayed no disclaimers. As public models become more capable and authoritative, disclaimers must be implemented as a safeguard adapting to the clinical context of each output.
Problem

Research questions and friction points this paper is trying to address.

Evaluates declining medical disclaimers in AI outputs
Assesses safety risks in generative medical AI models
Analyzes disclaimer trends from 2022 to 2025
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluated disclaimer presence in AI outputs
Used diverse medical images and questions
Proposed adaptive disclaimers for clinical context
🔎 Similar Papers
No similar papers found.
S
Sonali Sharma
Department of Medicine, University of British Columbia, Vancouver, BC, Canada Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
Ahmed M. Alaa
Ahmed M. Alaa
Assistant Professor, UC Berkeley and UCSF
Machine LearningArtificial IntelligenceCausal InferenceAI for MedicineHealthcare
R
Roxana Daneshjou
Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA