🤖 AI Summary
This study addresses variable selection in binary classification when the binary response is subject to misclassification (a form of measurement error). Leveraging auxiliary validation data, the authors propose, for the first time, both parametric and semiparametric penalized estimation methods that correct for misclassification bias while achieving consistent variable selection. With carefully chosen penalty functions and regularization parameters, the resulting estimators are rigorously shown to possess the oracle property, confirming their asymptotic optimality. Extensive numerical experiments further demonstrate superior finite-sample performance relative to existing alternatives, highlighting the methods' practical utility in settings with error-prone outcome measurements.
📝 Abstract
While variable selection has received extensive attention in the literature, it remains underexplored in the presence of response measurement error. In this paper, we investigate this important problem in the context of binary classification with error-prone responses and develop valid variable selection procedures that address the complexities induced by response errors. Leveraging validation data, we introduce both parametric and semiparametric methodologies to accommodate the mismeasurement effects. We rigorously establish theoretical results that justify the validity of the proposed methods, and we show that, with a properly chosen penalty function and regularization parameter, the resulting estimators possess the oracle property. Numerical studies assessing the finite-sample performance of the proposed methods confirm their effectiveness.