A Systematic Evaluation of Imbalance Handling Methods in Biomedical Binary Classification

📅 2026-05-13

🤖 AI Summary
This study systematically evaluates five class imbalance handling methods: random undersampling (RUS), random oversampling (ROS), SMOTE, reweighting, and direct F1 optimization. Each is compared against a raw training baseline on biomedical binary classification tasks. Within a unified experimental framework, the study investigates how model complexity interacts with data modality (tabular, text, and image). The evaluation spans architectures ranging from logistic regression and random forests to MLPs, BiLSTMs, BERT, DenseNet, and DINOv2. For simple models such as logistic regression on tabular data, no method shows a significant advantage over the raw baseline. For complex models, ROS and reweighting consistently improve performance, while direct F1 optimization is useful primarily on unstructured text and image data. RUS and SMOTE consistently degrade performance. Overall, the efficacy of an imbalance handling method depends on both model complexity and data modality.
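The summary does not say how direct F1 optimization is formulated. One common way to make F1 trainable is a "soft" F1 surrogate, in which predicted probabilities replace hard 0/1 predictions when counting true positives, false positives, and false negatives. The sketch below illustrates that idea in NumPy; the function name `soft_f1_loss` and the smoothing constant `eps` are assumptions made for illustration, not the paper's method.

```python
import numpy as np

def soft_f1_loss(probs, labels, eps=1e-8):
    """Return 1 - soft F1 for binary labels.

    Hypothetical helper: counts are computed from predicted probabilities
    rather than thresholded predictions, so the value varies smoothly
    with the model outputs.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    tp = np.sum(probs * labels)            # soft true positives
    fp = np.sum(probs * (1.0 - labels))    # soft false positives
    fn = np.sum((1.0 - probs) * labels)    # soft false negatives
    soft_f1 = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - soft_f1

# Perfect predictions give a loss near 0; missing every positive gives 1.
labels = np.array([1, 0, 0, 1])
loss_perfect = soft_f1_loss(labels, labels)
loss_all_negative = soft_f1_loss(np.zeros(4), labels)
```

In a deep learning framework the same expression would be written with differentiable tensor operations so that gradients flow through `probs`.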
📝 Abstract
Objective: The primary goal of this study was to systematically examine the impact of commonly used imbalance handling methods (IHMs) on predictive performance in biomedical binary classification, considering the interplay between model complexity and diverse data modalities.
Material and Methods: We evaluated five representative IHMs: random undersampling (RUS), random oversampling (ROS), SMOTE, re-weighting (RW), and direct F1-score optimization (DMO), against a raw training (RAW) baseline. The evaluation encompassed three public biomedical datasets: MIMIC-III (tabular), ADE-Corpus-V2 (text), and MURA (image), spanning three common biomedical data modalities. To assess varying model complexity, we employed a range of architectures, from classical logistic regression and random forest to deep neural networks, including multilayer perceptron (MLP), BiLSTM, BERT, DenseNet, and DINOv2.
Results: For simpler models such as logistic regression on tabular data, IHMs yielded no significant advantage over the RAW baseline, aligning with prior findings. However, clear benefits were observed for more complex models and unstructured data: (a) ROS and RW consistently enhanced the performance of powerful models; (b) direct F1-score optimization demonstrated utility primarily for unstructured text and image data; and (c) RUS and SMOTE consistently degraded performance and are therefore not recommended.
Conclusion: The effectiveness of IHMs depends on both model complexity and data modality. Performance gains are most pronounced when leveraging appropriate IHMs, such as ROS, RW, and DMO, on high-complexity models.
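To make the resampling and re-weighting IHMs concrete, here is a minimal NumPy sketch of RUS, ROS, and RW on a toy binary dataset. The function names and the toy data are assumptions for illustration. The weight formula is the "balanced" heuristic used by scikit-learn, `n_samples / (n_classes * class_count)`, which may differ from the weighting used in the paper. SMOTE is omitted because it synthesizes new points by interpolation and needs a nearest-neighbor search.

```python
import numpy as np

def random_undersample(X, y, rng):
    """RUS: randomly drop majority-class rows until both classes are equal in size."""
    idx_pos = np.flatnonzero(y == 1)
    idx_neg = np.flatnonzero(y == 0)
    n = min(len(idx_pos), len(idx_neg))
    keep = np.concatenate([
        rng.choice(idx_pos, n, replace=False),
        rng.choice(idx_neg, n, replace=False),
    ])
    return X[keep], y[keep]

def random_oversample(X, y, rng):
    """ROS: randomly duplicate minority-class rows until both classes are equal in size."""
    idx_pos = np.flatnonzero(y == 1)
    idx_neg = np.flatnonzero(y == 0)
    if len(idx_pos) < len(idx_neg):
        minority, majority = idx_pos, idx_neg
    else:
        minority, majority = idx_neg, idx_pos
    # Sample with replacement to fill the gap between the two class sizes.
    extra = rng.choice(minority, len(majority) - len(minority), replace=True)
    keep = np.concatenate([majority, minority, extra])
    return X[keep], y[keep]

def balanced_class_weights(y):
    """RW: per-class loss weights, n_samples / (n_classes * class_count)."""
    counts = np.bincount(y, minlength=2)
    return len(y) / (2.0 * counts)

# Toy data (hypothetical): 90 negatives and 10 positives.
rng = np.random.default_rng(0)
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

X_rus, y_rus = random_undersample(X, y, rng)   # 10 rows per class
X_ros, y_ros = random_oversample(X, y, rng)    # 90 rows per class
weights = balanced_class_weights(y)            # weight for class 0, class 1
```

Resampling should be applied to the training split only, so that validation and test sets keep the original class ratio. RUS discards 80 of the 100 toy rows, which illustrates why it can hurt data-hungry models.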
Problem

Research questions and friction points this paper is trying to address.

class imbalance
biomedical classification
data modality
model complexity
binary classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

imbalance handling methods
model complexity
data modality
biomedical classification
systematic evaluation
Authors
Jiandong Chen
Institute for Health Informatics, University of Minnesota, Minneapolis, MN
Lingjie Su
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN
Le Peng
University of Minnesota
AI4Healthcare, Deep Learning, Computer Vision, Natural Language Processing
Yash Travadi
School of Statistics, University of Minnesota, Minneapolis, MN
Rui Zhang
Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN
Ju Sun
McKnight Land-Grant Professor, Computer Sci. & Eng., University of Minnesota at Twin Cities
machine learning, computer vision, numerical optimization, AI for healthcare, AI for science