Filtering instances and rejecting predictions to obtain reliable models in healthcare

📅 2025-10-28

🤖 AI Summary

In high-stakes domains such as healthcare, machine learning models often ignore predictive uncertainty and return a prediction even when confidence is low, which undermines reliability. To address this, the paper proposes a two-stage data-centric framework: (1) during training, hard instances are filtered out using instance hardness (IH), which cleans the training set; (2) during inference, predictions whose confidence falls below a threshold are rejected. The contribution lies in combining hardness-based filtering with confidence-based rejection; influence-based filtering and uncertainty-based rejection serve as baselines. Experiments on three real-world healthcare datasets show that the combination improves the trade-off between predictive performance and rejection rate while retaining a large proportion of instances, which makes it a practical option for safety-critical deployment.

📝 Abstract

Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty, providing predictions even with low confidence. This work proposes a novel two-step data-centric approach to enhance the performance of ML models by improving data quality and filtering low-confidence predictions. The first step involves leveraging Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, ensuring that only reliable predictions are retained. We evaluate our approach using three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. Additionally, we use alternative criteria (influence values for filtering and uncertainty for rejection) as baselines to evaluate the efficiency of the proposed method. The results demonstrate that integrating IH filtering with confidence-based rejection effectively enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications.
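To make the first step concrete, here is a minimal sketch of hardness-based filtering. The abstract does not say which IH measure the authors use, so this example assumes k-Disagreeing Neighbors (kDN), a common IH measure defined as the fraction of an instance's k nearest neighbors that carry a different label. The function names, the `max_hardness` cutoff, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kdn_hardness(X: Sequence[Point], y: Sequence[int], k: int = 3) -> List[float]:
    """Assumed IH measure (kDN): share of the k nearest neighbors whose label differs."""
    scores = []
    for i, xi in enumerate(X):
        # Distances from instance i to every other instance, nearest first.
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbors = [j for _, j in dists[:k]]
        disagreeing = sum(y[j] != y[i] for j in neighbors)
        # Guard against a dataset with a single instance (no neighbors).
        scores.append(disagreeing / (len(neighbors) or 1))
    return scores


def filter_hard_instances(
    X: Sequence[Point], y: Sequence[int], max_hardness: float = 0.5, k: int = 3
) -> Tuple[List[Point], List[int]]:
    """Keep only training instances whose hardness does not exceed max_hardness."""
    scores = kdn_hardness(X, y, k)
    keep = [i for i, s in enumerate(scores) if s <= max_hardness]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: two well-separated clusters plus one point labeled 1
# that sits in the middle of the class-0 cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

scores = kdn_hardness(X, y, k=3)
X_kept, y_kept = filter_hard_instances(X, y, max_hardness=0.5, k=3)
```

On this toy set only the out-of-place point is dropped, because all of its nearest neighbors disagree with its label. The cutoff controls how aggressively the training set is cleaned; a lower value removes more borderline instances at the cost of discarding more data.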

Problem

Research questions and friction points this paper is trying to address.

Enhancing ML model reliability in healthcare through data quality improvement

Filtering problematic training instances using instance hardness analysis

Implementing confidence-based rejection of low-confidence predictions during inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Filtering instances using Instance Hardness during training

Rejecting low-confidence predictions during inference

Enhancing model reliability with a data-centric approach
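The rejection step at inference time can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: confidence is taken to be the maximum predicted class probability, and the threshold value, function names, and toy probabilities are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def predict_with_rejection(
    probas: Sequence[Sequence[float]], threshold: float
) -> List[Optional[int]]:
    """Return the most probable class, or None when its probability is below threshold."""
    decisions: List[Optional[int]] = []
    for p in probas:
        label = max(range(len(p)), key=p.__getitem__)  # index of the top class
        decisions.append(label if p[label] >= threshold else None)
    return decisions


def accuracy_and_rejection_rate(
    decisions: Sequence[Optional[int]], y_true: Sequence[int]
) -> Tuple[float, float]:
    """Accuracy over accepted predictions, and the fraction of predictions rejected."""
    accepted = [(d, t) for d, t in zip(decisions, y_true) if d is not None]
    rejection_rate = 1 - len(accepted) / len(decisions)
    if not accepted:
        return float("nan"), rejection_rate
    accuracy = sum(d == t for d, t in accepted) / len(accepted)
    return accuracy, rejection_rate


# Toy class probabilities from some binary classifier (made-up numbers).
probas = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.4, 0.6]]
y_true = [0, 1, 1, 0]

decisions = predict_with_rejection(probas, threshold=0.7)
accuracy, rejection_rate = accuracy_and_rejection_rate(decisions, y_true)
```

In this toy case the two low-confidence predictions are both wrong, so rejecting them raises accuracy on the accepted set while half of the inputs are deferred. Sweeping the threshold exposes the trade-off between predictive performance and rejection rate that the paper evaluates.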
