🤖 AI Summary
In high-stakes domains such as healthcare, machine learning models often ignore predictive uncertainty and still emit predictions when their confidence is low, which compromises reliability. To address this, the authors propose a two-stage data-driven framework: (1) during training, hard instances are filtered out using instance hardness to improve model robustness; (2) during inference, unreliable predictions are rejected based on a calibrated confidence threshold. The approach combines difficulty-aware training with confidence-driven rejection, in contrast to conventional methods based on uncertainty estimation or influence analysis. Experiments on three real-world clinical datasets indicate a better trade-off between predictive accuracy and rejection rate: the method retains the majority of instances while reducing misclassification risk. It maintains high accuracy without sacrificing robustness, which suggests potential for clinical deployment.
📝 Abstract
Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty and provide predictions even when their confidence is low. This work proposes a novel two-step data-centric approach that enhances the performance of ML models by improving data quality and filtering low-confidence predictions. The first step uses Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, so that only reliable predictions are retained. We evaluate our approach on three real-world healthcare datasets and show that it improves model reliability while balancing predictive performance and rejection rate. As baselines for assessing the proposed method, we also use alternative criteria: influence values for filtering and uncertainty for rejection. The results demonstrate that integrating IH filtering with confidence-based rejection enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications.
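The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation; the abstract does not say how IH is computed or how thresholds are chosen. The sketch assumes a common definition of instance hardness (one minus the average probability that a set of classifiers assigns to an instance's true label, ideally from out-of-fold predictions) and takes the maximum class probability as the confidence score. All function names and threshold values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def instance_hardness(true_label_probs: Sequence[Sequence[float]]) -> List[float]:
    """Assumed IH definition: 1 - mean probability given to the true label.

    true_label_probs[i][j] is the probability that classifier j assigns to
    the true label of training instance i.
    """
    return [1.0 - sum(probs) / len(probs) for probs in true_label_probs]


def filter_hard_instances(hardness: Sequence[float], ih_threshold: float) -> List[int]:
    """Step 1 (training): keep indices whose hardness is below the threshold."""
    return [i for i, h in enumerate(hardness) if h < ih_threshold]


def predict_with_rejection(
    class_probs: Sequence[Sequence[float]], confidence_threshold: float
) -> List[Optional[int]]:
    """Step 2 (inference): return the arg-max class, or None when rejected."""
    predictions: List[Optional[int]] = []
    for probs in class_probs:
        confidence = max(probs)  # assumed confidence score: top class probability
        label = list(probs).index(confidence)
        predictions.append(label if confidence >= confidence_threshold else None)
    return predictions


def accuracy_and_rejection_rate(
    predictions: Sequence[Optional[int]], labels: Sequence[int]
) -> Tuple[float, float]:
    """Accuracy over accepted predictions, and the fraction rejected."""
    accepted = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    rejection_rate = 1.0 - len(accepted) / len(predictions)
    if not accepted:
        return float("nan"), rejection_rate
    accuracy = sum(p == y for p, y in accepted) / len(accepted)
    return accuracy, rejection_rate


# Toy data (made up for illustration): three training instances scored by two
# classifiers, and three test instances with binary class probabilities.
hardness = instance_hardness([[0.9, 0.8], [0.2, 0.4], [0.6, 0.7]])
kept = filter_hard_instances(hardness, ih_threshold=0.5)
preds = predict_with_rejection([[0.95, 0.05], [0.55, 0.45], [0.1, 0.9]], 0.8)
acc, rej = accuracy_and_rejection_rate(preds, [0, 1, 0])
```

In this toy run the second training instance would be filtered out as hard, and the second test prediction would be rejected. Raising the confidence threshold lowers the risk of misclassification among accepted predictions at the cost of a higher rejection rate, which is the trade-off the abstract refers to.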