An Exploratory Approach Towards Investigating and Explaining Vision Transformer and Transfer Learning for Brain Disease Detection

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the automated classification of brain MRI scans for neurological disorder diagnosis. We systematically evaluate the Vision Transformer (ViT) against mainstream CNN-based transfer learning models (VGG16, VGG19, ResNet50V2, and MobileNetV2) on a Bangladeshi brain MRI dataset. Experimental results show that ViT achieves the best performance, with 94.39% classification accuracy, outperforming all compared CNN models. Methodologically, we introduce a unified framework integrating five Class Activation Mapping (CAM)-based eXplainable AI (XAI) techniques (GradCAM, GradCAM++, LayerCAM, ScoreCAM, and Faster-ScoreCAM) to enable architecture-agnostic, consistent visualization of lesion localization. To our knowledge, this is the first work to empirically validate ViT's diagnostic superiority on this geographically specific dataset. Our end-to-end pipeline jointly optimizes predictive accuracy and clinical interpretability, enhancing diagnostic trustworthiness and efficiency for radiologists.

📝 Abstract
The brain is a highly complex organ that manages many important tasks, including movement, memory, and thinking. Brain-related conditions, such as tumors and degenerative disorders, can be hard to diagnose and treat. Magnetic Resonance Imaging (MRI) serves as a key tool for identifying these conditions, offering high-resolution images of brain structures. Despite this, interpreting MRI scans can be complicated. This study tackles this challenge by conducting a comparative analysis of the Vision Transformer (ViT) and Transfer Learning (TL) models such as VGG16, VGG19, ResNet50V2, and MobileNetV2 for classifying brain diseases using MRI data from a Bangladesh-based dataset. ViT, known for its ability to capture global relationships in images, is particularly effective for medical imaging tasks. Transfer learning helps mitigate data constraints by fine-tuning pre-trained models. Furthermore, Explainable AI (XAI) methods such as GradCAM, GradCAM++, LayerCAM, ScoreCAM, and Faster-ScoreCAM are employed to interpret model predictions. The results demonstrate that ViT surpasses the transfer learning models, achieving a classification accuracy of 94.39%. The integration of XAI methods enhances model transparency, offering crucial insights that help medical professionals diagnose brain diseases with greater precision.
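As a rough illustration of the CAM family the abstract lists, the core Grad-CAM computation (global-average-pooled gradients as channel weights, followed by a ReLU-gated weighted sum of feature maps) can be sketched in NumPy. This is a minimal sketch under assumed array shapes; the function name and layout are illustrative, not the paper's implementation:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one conv layer's activations and the
    gradients of the target class score w.r.t. those activations.

    activations: (H, W, K) feature maps
    gradients:   (H, W, K) d(score)/d(activations)
    returns:     (H, W) heatmap normalized to [0, 1]
    """
    # Channel weights: global-average-pool the gradients over space
    weights = gradients.mean(axis=(0, 1))               # shape (K,)
    # Weighted sum of feature maps, then ReLU (keep positive evidence only)
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    # Normalize for overlay visualization
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice the heatmap would be upsampled to the input resolution and overlaid on the MRI slice; variants such as GradCAM++ and ScoreCAM differ mainly in how the channel weights are computed.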
Problem

Research questions and friction points this paper is trying to address.

Compare Vision Transformer and Transfer Learning for brain disease classification
Improve MRI-based diagnosis using explainable AI methods
Enhance model accuracy and transparency for medical professionals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Vision Transformer for brain disease classification
Applies Transfer Learning with pre-trained CNN models
Implements Explainable AI methods for prediction interpretation
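The transfer-learning idea above (reuse a pre-trained CNN backbone, retrain only the classification head on the new dataset) can be sketched with a plain NumPy softmax head trained by gradient descent. The features stand in for frozen backbone embeddings (e.g. globally pooled MobileNetV2 outputs); the learning rate and epoch count are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, n_classes, lr=0.1, epochs=200):
    """Train only the linear classification head; the backbone that
    produced `features` stays frozen throughout."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    y = np.eye(n_classes)[labels]         # one-hot targets
    for _ in range(epochs):
        p = softmax(features @ W + b)     # forward pass on frozen features
        gW = features.T @ (p - y) / n     # softmax cross-entropy gradients
        gb = (p - y).mean(axis=0)
        W -= lr * gW                      # only head parameters update
        b -= lr * gb
    return W, b
```

Fine-tuning, as used in the paper, goes one step further by also unfreezing some backbone layers at a small learning rate; the head-only version above is the simplest form of the recipe.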