Deep Neural Networks for Doubly Robust Estimation with Nonprobability Survey Samples

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two complementary limitations: non-probability samples lack population representativeness, while probability samples often lack the target variable. It proposes a deep neural network-assisted doubly robust estimation framework. The method models the logit sampling score of the non-probability sample flexibly with a deep neural network, builds a pseudo-likelihood with the aid of a reference probability sample, and maximizes it to obtain inverse probability weighted and doubly robust estimators. Theoretical analysis establishes consistency and convergence rates for the proposed estimators under regularity conditions. Simulation and empirical results suggest that, under nonlinear selection mechanisms, the approach is more robust to propensity-score misspecification than conventional parametric methods.

📝 Abstract

Integrating probability and nonprobability survey samples is an important problem in modern survey sampling. Nonprobability samples often contain rich outcome information but may lack population representativeness, whereas probability samples provide design-based auxiliary information but may not contain the study variable. We propose a deep neural network (DNN)-assisted doubly robust framework for estimating the finite population mean from these two data sources. The proposed method models the logit sampling score for the nonprobability sample as an unknown nonparametric function and estimates it by maximizing a pseudo-likelihood that combines information from the nonprobability sample and a reference probability sample. The DNN parameters are optimized using the ADAM algorithm. The resulting DNN-estimated sampling scores are incorporated into a DNN-assisted inverse-probability weighted estimator and a deep doubly robust estimator. We establish consistency and convergence rates under regularity conditions and evaluate the finite-sample performance of the proposed estimators through simulation studies and an empirical application using Pew Research Center and Behavioral Risk Factor Surveillance System data. The results suggest that the proposed estimators can improve robustness to parametric propensity-score misspecification, especially when the true selection mechanism is nonlinear.
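The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not state the exact pseudo-likelihood, so the sketch assumes the commonly used form sum_A f(x_i) - sum_B d_i log(1 + exp(f(x_i))), where f is the logit sampling score and d_i are the design weights of the reference probability sample. The one-hidden-layer network, the linear outcome model, the Hajek-type normalization, the simulated data, and all function names (`forward`, `fit_logit_score`, `estimate_mean`) are likewise assumptions made for illustration.

```python
import numpy as np

def forward(params, x):
    """One-hidden-layer tanh network; returns hidden activations and f(x) = logit pi(x)."""
    w1, b1, w2, b2 = params
    h = np.tanh(x @ w1 + b1)
    return h, (h @ w2 + b2).ravel()

def fit_logit_score(x_a, x_b, d_b, hidden=8, steps=500, lr=0.01, seed=0):
    """Minimize -(sum_A f(x_i) - sum_B d_i log(1 + exp(f(x_i)))) / N_hat with Adam.

    The pseudo-likelihood form is an assumption; see the lead-in text.
    """
    rng = np.random.default_rng(seed)
    n_hat = d_b.sum()                       # design-based estimate of population size
    rate = len(x_a) / n_hat                 # overall inclusion rate, assumed < 1
    params = [rng.normal(0, 0.3, (x_a.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.3, (hidden, 1)),
              np.array([np.log(rate / (1 - rate))])]  # output bias starts at logit(rate)
    m = [np.zeros_like(p) for p in params]  # Adam first-moment estimates
    v = [np.zeros_like(p) for p in params]  # Adam second-moment estimates
    x = np.vstack([x_a, x_b])
    for t in range(1, steps + 1):
        h, f = forward(params, x)
        f_b = np.clip(f[len(x_a):], -30, 30)
        # d(loss)/df: -1 for sample A units, d_i * sigmoid(f_i) for sample B units
        g = np.concatenate([-np.ones(len(x_a)), d_b / (1 + np.exp(-f_b))]) / n_hat
        dz = (g[:, None] * params[2].T) * (1 - h ** 2)   # backprop through tanh
        grads = [x.T @ dz, dz.sum(0), h.T @ g[:, None], np.array([g.sum()])]
        for k in range(4):
            m[k] = 0.9 * m[k] + 0.1 * grads[k]
            v[k] = 0.999 * v[k] + 0.001 * grads[k] ** 2
            step = (m[k] / (1 - 0.9 ** t)) / (np.sqrt(v[k] / (1 - 0.999 ** t)) + 1e-8)
            params[k] = params[k] - lr * step
    return params

def estimate_mean(params, x_a, y_a, x_b, d_b):
    """Return estimated sampling scores plus IPW and doubly robust mean estimates."""
    _, f_a = forward(params, x_a)
    pi_a = 1 / (1 + np.exp(-np.clip(f_a, -30, 30)))
    w_a = 1 / pi_a
    ipw = np.sum(w_a * y_a) / np.sum(w_a)               # Hajek-type IPW estimator
    # Linear outcome model fitted on sample A (an illustrative assumption)
    design = lambda z: np.column_stack([np.ones(len(z)), z])
    beta, *_ = np.linalg.lstsq(design(x_a), y_a, rcond=None)
    m_a, m_b = design(x_a) @ beta, design(x_b) @ beta
    # DR: weighted residual correction on A plus design-weighted prediction mean on B
    dr = np.sum(w_a * (y_a - m_a)) / np.sum(w_a) + np.sum(d_b * m_b) / np.sum(d_b)
    return pi_a, ipw, dr

# Simulated finite population with a nonlinear selection mechanism (hypothetical data)
rng = np.random.default_rng(1)
n_pop, n_b = 20000, 1000
x_pop = rng.normal(size=(n_pop, 2))
y_pop = 1 + 2 * x_pop[:, 0] + x_pop[:, 1] ** 2 + rng.normal(size=n_pop)
pi_true = 1 / (1 + np.exp(-(-2.5 + x_pop[:, 0] ** 2)))
in_a = rng.random(n_pop) < pi_true                      # nonprobability sample A
idx_b = rng.choice(n_pop, size=n_b, replace=False)      # simple random sample B
x_a, y_a = x_pop[in_a], y_pop[in_a]
x_b, d_b = x_pop[idx_b], np.full(n_b, n_pop / n_b)

params = fit_logit_score(x_a, x_b, d_b)
pi_a, ipw, dr = estimate_mean(params, x_a, y_a, x_b, d_b)
print("true mean:", y_pop.mean(), "naive:", y_a.mean(), "IPW:", ipw, "DR:", dr)
```

The doubly robust form is the point of interest here: when the outcome model fits well, the residual term is close to zero and the estimate leans on the reference sample's design weights, whereas when the sampling scores are well estimated, the weighted residuals correct a poor outcome model. How close the IPW and DR values land to the true mean in this toy setup depends on network size, step count, and learning rate, none of which are taken from the paper.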

Problem

Research questions and friction points this paper is trying to address.

nonprobability survey samples

doubly robust estimation

finite population mean

sampling score

data integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

deep neural networks

doubly robust estimation

nonprobability sampling