🤖 AI Summary
This study addresses slope-parameter estimation in binary choice models under severe class imbalance by proposing a support vector machine (SVM) approach that adjusts class weights to achieve consistent estimation of the slope parameters under the linear conditional mean assumption; the intercept is then recovered in a second step. Theoretical analysis shows that the proposed slope estimator is consistent and asymptotically equivalent to that of logistic regression. Finite-sample simulations indicate that its performance is comparable to that of quasi-maximum likelihood estimation (QMLE), with neither method dominating the other across settings. This work thus offers a novel theoretical perspective and a practical tool for modeling binary choice outcomes in imbalanced-data settings.
📝 Abstract
Although the support vector machine (SVM) is not a quasi-maximum likelihood estimator (QMLE), its asymptotic behavior parallels that of the QMLE for binary outcomes generated by a binary choice model (BCM). We show that, under the linear conditional mean condition on the covariates given the systematic component (the condition used in the literature on QMLE slope consistency), the slope of the separating hyperplane given by the SVM consistently estimates the BCM slope parameter, provided class weights are applied as required when binary outcomes are severely imbalanced. In this sense, the SVM slope estimator is asymptotically equivalent to that of logistic regression. The finite-sample performance of the two estimators can differ markedly depending on the distributions of the covariates and errors, but neither dominates the other. Once a consistent estimator of the slope parameter is obtained, the intercept parameter of the BCM can be consistently estimated as well.
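The class-weighting recipe described in the abstract can be illustrated with a small simulation. The sketch below is a stand-in, not the paper's estimator: it uses class-weighted logistic regression (the asymptotic equivalent named in the abstract) rather than an SVM, fits it with Newton's method in pure Python, and uses illustrative parameter values chosen here. With logistic errors, balanced class weights shift only the intercept (by the log of the weight ratio) while leaving the slope estimates consistent, which is the behavior the paper attributes to the weighted SVM slope.

```python
import math
import random

random.seed(0)

# Simulate an imbalanced binary choice model (BCM):
#   y = 1{ a0 + b1*x1 + b2*x2 + e > 0 }, e ~ standard logistic,
# so P(y=1|x) = Lambda(a0 + b1*x1 + b2*x2). The large negative
# intercept makes positives rare. All values here are illustrative.
a0, b1, b2 = -3.0, 1.0, 2.0
n = 4000
X, y = [], []
for _ in range(n):
    x1, x2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    u = random.random()
    e = math.log(u / (1.0 - u))      # standard logistic draw
    X.append((1.0, x1, x2))          # leading 1 for the intercept
    y.append(1 if a0 + b1 * x1 + b2 * x2 + e > 0 else 0)

n_pos = sum(y)
n_neg = n - n_pos
# "Balanced" class weights: each class contributes equally in total.
w_pos, w_neg = n / (2.0 * n_pos), n / (2.0 * n_neg)

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small system.
    p = len(b)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, p):
            f = M[r][i] / M[i][i]
            for c in range(i, p + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * p
    for i in range(p - 1, -1, -1):
        x[i] = (M[i][p] - sum(M[i][c] * x[c] for c in range(i + 1, p))) / M[i][i]
    return x

def newton_weighted_logit(X, y, w_pos, w_neg, iters=15):
    # Class-weighted logistic regression fitted by Newton's method.
    p = len(X[0])
    theta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        H = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            w = w_pos if yi == 1 else w_neg
            s = sigmoid(sum(t * v for t, v in zip(theta, xi)))
            r = w * (s - yi)
            for j in range(p):
                grad[j] += r * xi[j]
                for k in range(p):
                    H[j][k] += w * s * (1.0 - s) * xi[j] * xi[k]
        step = solve(H, grad)
        theta = [t - st for t, st in zip(theta, step)]
    return theta

theta = newton_weighted_logit(X, y, w_pos, w_neg)
# Weighting shifts only the intercept, by log(w_pos / w_neg);
# subtracting that shift recovers the BCM intercept, as in the
# paper's two-step idea (slope first, then intercept).
a_hat = theta[0] - math.log(w_pos / w_neg)
print("slope estimates:", theta[1], theta[2])
print("intercept after weight correction:", a_hat)
```

The slope estimates land near the true (b1, b2) = (1, 2) despite the imbalance, and undoing the known intercept shift recovers a0; an SVM version would replace the weighted log-likelihood with a weighted hinge loss, with the slope identified up to the same linear-conditional-mean argument.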