🤖 AI Summary
To address the challenge of jointly estimating frontier functions and inefficiency terms in high-dimensional "wide data" settings, where conventional stochastic frontier analysis (SFA) methods suffer from model misspecification bias, we propose the first integration of the post-double LASSO into SFA. Our approach constructs Neyman-orthogonal moment conditions, enabling doubly robust estimation of both the production frontier and the inefficiency term. This framework mitigates endogeneity and selection bias under high-dimensional sparse designs while ensuring both model-selection consistency and asymptotic normality of the parameter estimates. Simulation studies and an empirical application demonstrate that the method significantly improves the accuracy, robustness, and statistical reliability of efficiency measurement relative to standard SFA and single-stage LASSO-based approaches. The proposed estimator thus offers a generalizable, high-dimensional econometric solution for efficiency evaluation in big-data contexts.
📝 Abstract
Big data and machine learning methods have become commonplace across economics. One area that has yet to receive comparable attention is efficiency analysis. We show how the availability of big (wide) data can actually make the detection of inefficiency more challenging. We then show how machine learning methods can be leveraged to adequately estimate the primitives of the frontier itself, as well as inefficiency, via the post-double LASSO, by deriving Neyman-orthogonal moment conditions for this problem. Finally, an application is presented to illustrate key differences between the post-double LASSO and other approaches.
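To fix ideas, the post-double LASSO referenced above can be sketched in its generic (Belloni-Chernozhukov-Hansen style) form: one Lasso selects controls predictive of the outcome, a second Lasso selects controls predictive of the target regressor, and OLS is run on the union of the selected controls. This is a minimal sketch on simulated data, not the paper's SFA estimator; the data-generating process, sample sizes, and variable names are illustrative assumptions.

```python
# Minimal sketch of the generic post-double LASSO, NOT the paper's
# SFA-specific estimator; the simulated DGP below is an assumption.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(0)
n, p = 200, 100                        # "wide" design: many candidate controls
X = rng.standard_normal((n, p))        # high-dimensional controls
d = X[:, 0] + rng.standard_normal(n)   # regressor of interest, confounded by X[:, 0]
y = 0.5 * d + X[:, 0] + rng.standard_normal(n)  # true coefficient on d is 0.5

# Step 1: Lasso of y on X selects controls predictive of the outcome.
sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
# Step 2: Lasso of d on X selects controls predictive of the regressor.
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
# Step 3: OLS of y on d plus the UNION of the two selected control sets.
keep = np.union1d(sel_y, sel_d)
Z = np.column_stack([d, X[:, keep]])
beta = LinearRegression().fit(Z, y).coef_[0]
print(round(beta, 2))  # post-double-LASSO estimate of the coefficient on d
```

The union step is what distinguishes this from a single-stage ("post-single") Lasso: a control that matters for `d` but only weakly for `y` is still retained, which is the source of the robustness the abstract refers to.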