AI Summary
Mild traumatic brain injury (mTBI) poses significant clinical challenges due to its radiologically subtle manifestations, which lead to low diagnostic accuracy. Method: This study introduces the first intelligent diagnostic system integrating fine-tuned multimodal vision-language models (e.g., ViLT, BLIP-2) with the OpenAI-o3 reasoning large language model (LLM). We propose an LLM Agent-driven, interpretable decision framework leveraging customized prompt engineering, multi-model consensus aggregation, and dynamic task orchestration to enable end-to-end MRI analysis and diagnostic reasoning. Contribution/Results: To our knowledge, this is the first work synergizing a coalition of vision-language models with a reasoning-oriented LLM for mTBI prediction, establishing a clinically interpretable, automated diagnostic paradigm. Evaluated on real-world MRI data provided by military physicians, the system significantly improves robustness in detecting occult lesions and enhances diagnostic confidence. A prototype has been successfully deployed with the U.S. Army Medical Research Team.
Abstract
Mild traumatic brain injury (TBI) detection presents significant challenges due to the subtle and often ambiguous presentation of symptoms in medical imaging, making accurate diagnosis a complex task. To address these challenges, we propose Proof-of-TBI, a medical diagnosis support system that integrates multiple fine-tuned vision-language models with the OpenAI-o3 reasoning large language model (LLM). Our approach fine-tunes multiple vision-language models on a labeled dataset of TBI MRI scans, training them to recognize TBI symptoms effectively. The predictions from these models are aggregated through a consensus-based decision-making process: the system evaluates the predictions from all fine-tuned vision-language models using the OpenAI-o3 reasoning LLM, a model that has demonstrated remarkable reasoning performance, to produce the most accurate final diagnosis. An LLM agent orchestrates the interactions between the vision-language models and the reasoning LLM, managing the final decision-making process with transparency, reliability, and automation. This end-to-end decision-making workflow combines the vision-language model consortium with the OpenAI-o3 reasoning LLM, enabled by custom prompt engineering performed by the LLM agent. A prototype of the proposed platform was developed in collaboration with the U.S. Army Medical Research team in Newport News, Virginia, incorporating five fine-tuned vision-language models. The results demonstrate the transformative potential of combining fine-tuned vision-language model inputs with the OpenAI-o3 reasoning LLM to create a robust, secure, and highly accurate diagnostic system for mild TBI prediction. To the best of our knowledge, this research represents the first application of fine-tuned vision-language models integrated with a reasoning LLM for TBI prediction tasks.
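The consensus-aggregation and prompt-engineering steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the model names, the confidence-weighted vote, and the prompt wording are all assumptions, and the resulting prompt would in practice be forwarded by the LLM agent to the OpenAI-o3 reasoning model rather than printed.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class VLMPrediction:
    model_name: str   # hypothetical identifier for a fine-tuned VLM
    label: str        # "TBI" or "non-TBI"
    confidence: float # model confidence in [0, 1]

def aggregate_consensus(predictions):
    """Confidence-weighted majority vote over the VLM consortium."""
    weights = Counter()
    for p in predictions:
        weights[p.label] += p.confidence
    label, score = weights.most_common(1)[0]
    return label, score / sum(weights.values())

def build_adjudication_prompt(predictions, consensus_label, agreement):
    """Assemble the per-model votes into a prompt for the reasoning LLM."""
    lines = [
        "You are a diagnostic reasoning assistant for mild TBI.",
        "Five fine-tuned vision-language models examined the same MRI scan:",
    ]
    for p in predictions:
        lines.append(f"- {p.model_name}: {p.label} (confidence {p.confidence:.2f})")
    lines.append(f"Preliminary consensus: {consensus_label} (agreement {agreement:.2f}).")
    lines.append("Weigh any dissenting votes and give a final diagnosis with rationale.")
    return "\n".join(lines)

# Illustrative votes from five hypothetical fine-tuned VLMs
preds = [
    VLMPrediction("vilt-ft", "TBI", 0.82),
    VLMPrediction("blip2-ft", "TBI", 0.74),
    VLMPrediction("llava-ft", "non-TBI", 0.55),
    VLMPrediction("git-ft", "TBI", 0.68),
    VLMPrediction("flamingo-ft", "TBI", 0.71),
]
label, agreement = aggregate_consensus(preds)
prompt = build_adjudication_prompt(preds, label, agreement)
print(label)  # → TBI
```

The confidence weighting is one plausible aggregation choice; a plain majority vote, or passing the raw per-model outputs to the reasoning LLM without any preliminary tally, would fit the described architecture equally well.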