Addressing both variable selection and misclassified responses with parametric and semiparametric methods

📅 2026-03-13
🤖 AI Summary
This study addresses variable selection in binary classification when the response variable is subject to misclassification, a form of measurement error. Using auxiliary validation data, the authors propose parametric and semiparametric penalized estimation methods that correct for misclassification bias while selecting variables. With a suitably chosen penalty function and regularization parameter, the resulting estimators are shown to possess the oracle property. Numerical studies assess the finite-sample performance of the methods and confirm their effectiveness in settings with error-prone outcome measurements.

📝 Abstract
While variable selection has received extensive attention in the literature, it remains underexplored in the presence of response measurement error. In this paper, we investigate this important problem within the context of binary classification with error-prone responses. We present valid variable selection procedures to address the complexities of response errors. Leveraging validation data, we introduce both parametric and semiparametric methodologies to accommodate the mismeasurement effects. By rigorously establishing theoretical results, we offer insights into and justifications for the validity of the proposed methods. By properly choosing the penalty function and regularization parameter, we demonstrate that the resulting estimators possess the oracle property. To assess the finite-sample properties of the proposed methods, we conduct numerical studies that confirm their effectiveness.
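The general idea in the abstract, penalized estimation that corrects for a misclassified binary response using validation data, can be illustrated with a minimal sketch. This is not the authors' estimator. It assumes a logistic model with no intercept, misclassification rates that do not depend on the covariates, and a plain L1 penalty fitted by proximal gradient descent. Penalties such as SCAD or the adaptive lasso are the usual choices when an oracle property is sought; L1 is used here only for brevity. All function names, the simulated data, and the tuning values (`lam`, `step`, `n_iter`) are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def estimate_rates(y_true, y_obs):
    """Estimate misclassification rates from validation data,
    where both the true label and the error-prone label are observed."""
    g01 = np.mean(y_obs[y_true == 0] == 1)  # P(Y* = 1 | Y = 0)
    g10 = np.mean(y_obs[y_true == 1] == 0)  # P(Y* = 0 | Y = 1)
    return g01, g10

def fit_corrected_lasso(X, y_obs, g01, g10, lam, step=0.5, n_iter=2000):
    """L1-penalized fit of the observed-label likelihood
    P(Y* = 1 | X) = g01 + (1 - g01 - g10) * expit(X @ beta),
    using proximal gradient descent (assumed setup, not the paper's)."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = expit(X @ beta)
        q = np.clip(g01 + (1.0 - g01 - g10) * p, 1e-10, 1 - 1e-10)
        dq = (1.0 - g01 - g10) * p * (1.0 - p)  # dq / d(linear predictor)
        # Gradient of the average negative log-likelihood of Y*.
        grad = X.T @ ((q - y_obs) / (q * (1.0 - q)) * dq) / n
        z = beta - step * grad
        # Soft-thresholding: proximal step for the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

# Simulated example: two active covariates, three inactive ones.
rng = np.random.default_rng(0)
n, n_val = 2000, 500
beta_true = np.array([1.5, -1.0, 0.0, 0.0, 0.0])
X = rng.normal(size=(n, 5))
y = rng.binomial(1, expit(X @ beta_true))
flip = rng.random(n) < 0.1            # 10% of labels are flipped
y_obs = np.where(flip, 1 - y, y)      # error-prone response Y*

# Only the first n_val rows play the role of validation data.
g01, g10 = estimate_rates(y[:n_val], y_obs[:n_val])
beta_hat = fit_corrected_lasso(X, y_obs, g01, g10, lam=0.02)
print(np.round(beta_hat, 2))
```

In this toy setup the true labels are used only on the validation rows to estimate the two error rates. The fit itself uses the error-prone labels for all rows. Setting `g01 = g10 = 0` reduces the likelihood to ordinary L1-penalized logistic regression, which gives a naive baseline for comparison.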
Problem

Research questions and friction points this paper is trying to address.

variable selection
misclassified responses
measurement error
binary classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

variable selection
response misclassification
validation data
semiparametric method
oracle property
Hui Guo
Department of Computer Science, University of Western Ontario, Ontario, Canada
Grace Y. Yi
Department of Computer Science, University of Western Ontario, Ontario, Canada; Department of Statistical and Actuarial Sciences, University of Western Ontario, Ontario, Canada
Boyu Wang
Department of Computer Science, University of Western Ontario
machine learning, machine learning applications, computational neuroscience, biomedical engineering