LLM-Bootstrapped Targeted Finding Guidance for Factual MLLM-based Medical Report Generation

📅 2026-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of factual inconsistency—such as lesion omission or hallucination—that hinders the clinical deployment of multimodal large language models (MLLMs) in medical report generation. To mitigate this issue, the authors propose Fact-Flow, a framework that decouples visual fact extraction from text generation: it first identifies structured clinical findings from medical images and then uses these findings to guide the MLLM in producing factually accurate reports. Notably, the framework leverages a large language model to automatically construct a labeled dataset of medical findings, circumventing the need for costly manual annotation. Evaluated on two disease-specific datasets, Fact-Flow significantly improves factual accuracy while maintaining high-quality narrative generation, outperforming current state-of-the-art methods.
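The decoupled pipeline described above can be sketched as a minimal two-stage flow. All function names and the stub findings below are illustrative placeholders, not the authors' actual code: stage one predicts structured findings from the image, stage two serializes them into the MLLM prompt so the report is grounded in explicit facts.

```python
# Hypothetical sketch of a Fact-Flow-style two-stage pipeline.
# Names and findings are placeholders, not the paper's implementation.

def extract_findings(image):
    """Stage 1: predict structured clinical findings from the image.
    A stub list stands in for the visual finding classifier."""
    return ["cardiomegaly: present", "pleural effusion: absent"]

def build_prompt(findings):
    """Serialize findings into the generation prompt so the MLLM is
    guided by an explicit factual basis rather than raw image features."""
    facts = "\n".join(f"- {f}" for f in findings)
    return (
        "Write a radiology report consistent with these findings:\n"
        f"{facts}\nReport:"
    )

def generate_report(image, mllm):
    """Stage 2: the MLLM produces a report conditioned on the findings."""
    findings = extract_findings(image)
    return mllm(build_prompt(findings))

# Toy usage with a stub model standing in for the MLLM:
report = generate_report(image=None, mllm=lambda p: "[grounded report]\n" + p)
```

In this sketch the findings act as an intermediate, auditable factual layer: if the stage-one classifier omits or hallucinates a finding, the error is visible before generation, which is the failure mode the framework targets.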

📝 Abstract
The automatic generation of medical reports utilizing Multimodal Large Language Models (MLLMs) frequently encounters challenges related to factual instability, which may manifest as the omission of findings or the incorporation of inaccurate information, thereby constraining their applicability in clinical settings. Current methodologies typically produce reports based directly on image features, which inherently lack a definitive factual basis. In response to this limitation, we introduce Fact-Flow, an innovative framework that separates the process of visual fact identification from the generation of reports. This is achieved by initially predicting clinical findings from the image, which subsequently directs the MLLM to produce a report that is factually precise. A pivotal advancement of our approach is a pipeline that leverages a Large Language Model (LLM) to autonomously create a dataset of labeled medical findings, effectively eliminating the need for expensive manual annotation. Extensive experimental evaluations conducted on two disease-focused medical datasets validate the efficacy of our method, demonstrating a significant enhancement in factual accuracy compared to state-of-the-art models, while concurrently preserving high standards of text quality.
Problem

Research questions and friction points this paper is trying to address:

- factual instability
- medical report generation
- Multimodal Large Language Models
- clinical findings
- factually inaccurate information
Innovation

Methods, ideas, or system contributions that make the work stand out:

- Fact-Flow
- factual accuracy
- LLM-bootstrapped annotation
- medical report generation
- multimodal large language models
Authors

- Cunyuan Yang, Zhejiang University
- Dejuan Song, The Second Affiliated Hospital Zhejiang University School of Medicine
- Xiaotao Pang, Hangzhou Pu Jian Medical Technology Co., Ltd.
- Qianqian Shen, Zhejiang University
- Wenjie Nie, Zhejiang University
- Yifan Huang, Zhejiang University
- Lei Wu, Zhejiang University
- Wei Han, The Second Affiliated Hospital Zhejiang University School of Medicine
- Haishuai Wang, Harvard University
- Jiajun Bu, Zhejiang University