Predictive Analytics for Dementia: Machine Learning on Healthcare Data

📅 2026-01-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenges of delayed clinical diagnosis and class imbalance in early dementia prediction by proposing a supervised learning framework that integrates structured health records with unstructured clinical text. To mitigate data imbalance, the authors employ SMOTE oversampling, while textual features are processed using TF-IDF vectorization. The performance of several classifiers—including K-nearest neighbors (KNN), quadratic discriminant analysis (QDA), Gaussian processes, and linear discriminant analysis (LDA)—is systematically evaluated. Among these, the LDA model achieves 98% accuracy on the test set, substantially outperforming baseline approaches. Furthermore, interpretability analyses reveal strong associations between dementia risk and factors such as the APOE-ε4 allele and comorbid chronic conditions like diabetes, demonstrating the dual advantage of the proposed pipeline in both predictive performance and clinically meaningful insights.
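To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the standard library only. It is not the authors' implementation: the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy feature values are all assumptions for illustration. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base sample (itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical, already-scaled structured features (not data from the paper).
majority = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.15, 0.25], [0.25, 0.2], [0.05, 0.1]]
minority = [[0.8, 0.9], [0.9, 0.8], [0.85, 0.95]]

# Add just enough synthetic samples to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Oversampling of this kind is usually applied to the training split only, so that synthetic points derived from test samples cannot inflate the reported test accuracy.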

📝 Abstract

Dementia is a complex syndrome that impairs cognitive and emotional function; Alzheimer's disease is its most common form. This study aims to improve dementia prediction by applying machine learning (ML) techniques to patient health data. Four supervised learning algorithms are evaluated: K-Nearest Neighbors (KNN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Gaussian Process Classifiers. The Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance, and Term Frequency-Inverse Document Frequency (TF-IDF) vectorization was used to encode textual features. Among the models, LDA achieved the highest test accuracy, 98%. The study highlights the importance of model interpretability and reports that dementia correlates with features such as the presence of the APOE-ε4 allele and chronic conditions like diabetes. It advocates for future ML work, particularly the integration of explainable AI approaches, to further improve predictive capabilities in dementia care.
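As an illustration of the TF-IDF step, here is a small standard-library sketch that weights each term by its frequency within a note and its rarity across notes. The function name `tfidf`, the whitespace tokenizer, the unsmoothed `ln(N / df)` weighting, and the example notes are assumptions made for this sketch; the paper does not state which variant it used, and library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers would differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * ln(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical clinical-note snippets (not data from the paper).
notes = [
    "patient reports memory loss",
    "patient has diabetes",
    "patient reports confusion and memory loss",
]
weights = tfidf(notes)
for note, vector in zip(notes, weights):
    print(note, "->", {t: round(w, 3) for t, w in vector.items()})
```

A term that occurs in every note, such as "patient" here, receives a weight of zero, while a term confined to a single note, such as "diabetes", receives a comparatively high weight. This is why TF-IDF features tend to emphasize the distinguishing vocabulary of each record.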

Problem

Research questions and friction points this paper is trying to address.

dementia

predictive analytics

machine learning

healthcare data

Alzheimer's disease

Innovation

Methods, ideas, or system contributions that make the work stand out.

machine learning

dementia prediction

SMOTE