🤖 AI Summary
This study addresses the performance degradation that joint models of mixed continuous and binary responses suffer under outliers or label noise. The authors propose a robust framework based on the density power divergence (DPD) loss combined with ℓ₁ regularization. This work is the first to introduce DPD loss into mixed-response modeling. It pairs a proximal gradient algorithm using Barzilai-Borwein step sizes with a robust information criterion (RIC), so that the method can down-weight anomalous samples, estimate sparse high-dimensional parameters, and select the penalty parameters from the data. Extensive simulations and a real case study on wafer lapping in semiconductor manufacturing show that the proposed method achieves lower prediction error and more accurate parameter estimation than several competing approaches across a variety of contamination schemes, with additional gains in interpretability.
📝 Abstract
In many supervised learning applications, the response consists of both continuous and binary outcomes. Studies have shown that jointly modeling such mixed-type responses can substantially improve predictive performance compared to separate analyses. However, outliers pose a challenge to existing likelihood-based modeling approaches. In this paper, we propose a new robust joint modeling framework for data with both continuous and binary responses. It is based on the density power divergence (DPD) loss function with $\ell_1$ regularization. The proposed framework leads to a sparse estimator that simultaneously predicts continuous and binary responses in high-dimensional input settings while down-weighting influential outliers and mislabeled samples. We also develop an efficient proximal gradient algorithm with a Barzilai-Borwein spectral step size and a robust information criterion (RIC) for data-driven selection of the penalty parameters. Extensive simulation studies under a variety of contamination schemes demonstrate that the proposed method achieves lower prediction error and more accurate parameter estimation than several competing approaches. A real case study on wafer lapping in semiconductor manufacturing further illustrates the practical gains in predictive accuracy, robustness, and interpretability of the proposed framework.
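To make the algorithmic idea concrete, the following is a minimal sketch of an $\ell_1$-penalized DPD fit solved by proximal gradient with a Barzilai-Borwein (BB) step. It is not the paper's estimator: it covers only a single continuous response under a Gaussian linear model with known scale, whereas the paper models continuous and binary responses jointly. The function names (`soft_threshold`, `dpd_loss_grad`, `fit_dpd_lasso`), the tuning values `alpha=0.5` and `lam=0.05`, the monotone backtracking safeguard, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dpd_loss_grad(beta, X, y, alpha=0.5, sigma=1.0):
    """Beta-dependent part of the DPD loss for y ~ N(X beta, sigma^2), and its gradient.

    The term that does not depend on beta (the integral of f^(1+alpha)) is dropped.
    """
    r = y - X @ beta
    w = np.exp(-alpha * r**2 / (2.0 * sigma**2))  # near zero for large residuals
    c = (2.0 * np.pi * sigma**2) ** (-alpha / 2.0)
    loss = -(1.0 + 1.0 / alpha) * c * w.mean()
    grad = -(1.0 + alpha) * c / sigma**2 * (X.T @ (w * r)) / len(y)
    return loss, grad

def fit_dpd_lasso(X, y, lam, alpha=0.5, sigma=1.0, n_iter=500, tol=1e-8):
    """Proximal gradient with a BB step and a simple monotone backtracking safeguard."""
    beta = np.zeros(X.shape[1])
    loss, grad = dpd_loss_grad(beta, X, y, alpha, sigma)
    step = 1.0
    for _ in range(n_iter):
        f_old = loss + lam * np.abs(beta).sum()
        while True:
            # Gradient step on the smooth loss, then the l1 proximal map.
            beta_new = soft_threshold(beta - step * grad, step * lam)
            loss_new, grad_new = dpd_loss_grad(beta_new, X, y, alpha, sigma)
            if loss_new + lam * np.abs(beta_new).sum() <= f_old or step < 1e-10:
                break
            step *= 0.5  # halve the step until the objective does not increase
        s, g = beta_new - beta, grad_new - grad
        beta, loss, grad = beta_new, loss_new, grad_new
        if np.linalg.norm(s) < tol:
            break
        sg = s @ g
        # BB step (s's / s'g); fall back to 1.0 when the curvature estimate is not positive.
        step = (s @ s) / sg if sg > 1e-12 else 1.0
        step = min(max(step, 1e-4), 1e2)
    return beta

# Simulated data (assumed setup): sparse signal, 10% of responses shifted by +20.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + rng.standard_normal(n)
y[:20] += 20.0
beta_hat = fit_dpd_lasso(X, y, lam=0.05)
print(np.round(beta_hat, 2))
```

The weights `w` show where the robustness comes from in this simplified setting: an observation with a large residual contributes almost nothing to the gradient, so grossly contaminated responses have little influence on the fit. Because the DPD loss is non-convex, the result can depend on the starting point; the zero initialization here is a convenience, not a recommendation from the paper.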