Robust Joint Modeling for Data with Continuous and Binary Responses

📅 2026-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the performance degradation of joint models for mixed continuous and binary responses when the data contain outliers or label noise. The authors propose a robust framework based on the density power divergence (DPD) loss combined with ℓ₁ regularization. The framework pairs a proximal gradient algorithm using Barzilai–Borwein step sizes with a robust information criterion (RIC), so that it can down-weight anomalous samples, estimate sparse parameters in high dimensions, and select the penalty parameters from the data. Simulations under several contamination schemes and a real case study on wafer lapping in semiconductor manufacturing show lower prediction error and more accurate parameter estimation than several competing approaches, along with gains in interpretability.
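
For orientation, the standard form of an ℓ₁-penalized DPD objective is sketched below. This is the generic textbook form and not necessarily the paper's exact parameterization: the symbols $f_\theta$, $\alpha$, $\lambda$, and the single shared penalty are assumptions made here for illustration (the abstract mentions several penalty parameters).

$$
\hat{\theta} = \arg\min_{\theta}\; \frac{1}{n}\sum_{i=1}^{n}\left[\int f_\theta(u \mid x_i)^{1+\alpha}\,du \;-\; \left(1+\frac{1}{\alpha}\right) f_\theta(y_i \mid x_i)^{\alpha}\right] + \lambda\,\lVert\theta\rVert_1, \qquad \alpha > 0 .
$$

Here $f_\theta(\cdot \mid x_i)$ stands for the joint model density of the responses given the inputs, and the integral is read as a sum over the binary component. A sample with a very small model density contributes almost nothing through $f_\theta(y_i \mid x_i)^{\alpha}$, which is how outliers and mislabeled points are down-weighted. As $\alpha \to 0$ the DPD loss approaches the negative log-likelihood, so $\alpha$ trades efficiency for robustness.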

📝 Abstract

In many supervised learning applications, the response consists of both continuous and binary outcomes. Studies have shown that jointly modeling such mixed-type responses can substantially improve predictive performance compared to separate analyses. However, outliers pose a challenge to existing likelihood-based modeling approaches. In this paper, we propose a new robust joint modeling framework for data with both continuous and binary responses. It is based on the density power divergence (DPD) loss function with $\ell_1$ regularization. The proposed framework leads to a sparse estimator that simultaneously predicts continuous and binary responses in high-dimensional input settings while down-weighting influential outliers and mislabeled samples. We also develop an efficient proximal gradient algorithm with Barzilai–Borwein spectral step sizes and a robust information criterion (RIC) for data-driven selection of the penalty parameters. Extensive simulation studies under a variety of contamination schemes demonstrate that the proposed method achieves lower prediction error and more accurate parameter estimation than several competing approaches. A real case study on wafer lapping in semiconductor manufacturing further illustrates the practical gains in predictive accuracy, robustness, and interpretability of the proposed framework.
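
The algorithmic ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only a continuous (Gaussian) response with the noise scale held fixed, omits the binary response and the RIC, and uses no line search. All function names (`soft_threshold`, `dpd_weights`, `dpd_grad`, `prox_grad_bb`), the toy data, and the values of `alpha` and `lam` are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dpd_weights(r, alpha, sigma=1.0):
    """Per-sample weights exp(-alpha r^2 / (2 sigma^2)); large residuals get ~0."""
    return np.exp(-alpha * r**2 / (2.0 * sigma**2))

def dpd_grad(beta, X, y, alpha, sigma=1.0):
    """Gradient in beta of the Gaussian DPD loss with sigma held fixed (assumption)."""
    r = y - X @ beta
    c = (1.0 + alpha) * (2.0 * np.pi * sigma**2) ** (-alpha / 2.0) / sigma**2
    return -c * (X.T @ (dpd_weights(r, alpha, sigma) * r)) / len(y)

def prox_grad_bb(X, y, lam, alpha=0.5, n_iter=300, step0=1.0):
    """Proximal gradient with a BB1 step size; no line search (a simplification)."""
    beta = np.zeros(X.shape[1])
    grad = dpd_grad(beta, X, y, alpha)
    step = step0
    for _ in range(n_iter):
        # Gradient step on the smooth DPD part, then the l1 proximal step
        beta_new = soft_threshold(beta - step * grad, step * lam)
        grad_new = dpd_grad(beta_new, X, y, alpha)
        s, g = beta_new - beta, grad_new - grad
        sg = s @ g
        # BB1 step <s,s>/<s,g>, clipped; fall back to step0 if curvature is not positive
        step = float(np.clip((s @ s) / sg, 1e-4, 1e2)) if sg > 1e-12 else step0
        beta, grad = beta_new, grad_new
    return beta

# Hypothetical toy data: sparse truth, first 10 responses grossly contaminated
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta_true = np.zeros(10)
beta_true[:2] = [1.0, -1.0]
y = X @ beta_true + 0.1 * rng.normal(size=200)
y[:10] += 20.0
beta_hat = prox_grad_bb(X, y, lam=0.01, alpha=0.5)
```

The weights returned by `dpd_weights` make the robustness mechanism visible: a residual of 0 keeps weight 1, while a residual of 10 with `alpha=0.5` gets a weight of about exp(-25), so contaminated samples barely move the gradient. Because the DPD loss is non-convex, a practical implementation would likely add a line search or other safeguard around the Barzilai–Borwein step.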

Problem

Research questions and friction points this paper is trying to address.

robust joint modeling

mixed-type responses

outliers

continuous and binary responses

high-dimensional data

Innovation

Methods, ideas, or system contributions that make the work stand out.

density power divergence

robust joint modeling

mixed-type responses