Inference from multivariate differential recruitment in respondent-driven sampling data

📅 2026-04-11

🤖 AI Summary
Traditional respondent-driven sampling (RDS) inference typically assumes random recruitment. In practice, individuals may recruit peers differentially according to several covariates at once, both categorical and continuous, and ignoring this can bias estimates. This work proposes a multivariate differential recruitment (MDR) framework that jointly models the influence of multiple covariates on recruitment and formalizes the RDS process as a Markov process whose transition probabilities depend on node- or tie-level covariates. Building on this, the authors extend several prevalence estimators to the multivariate setting and use a slightly modified neighborhood bootstrap for variance estimation. The approach is assessed in simulation studies across a range of network and sampling features and applied to an RDS study of the adult Venezuelan population in the Metropolitan Region of Santiago, Chile.

📝 Abstract
Respondent-Driven Sampling (RDS) is a chain-referral design used for collecting data from hidden or hard-to-reach populations through their social networks. In RDS, respondents recruit their peers from the population of interest. As such, inference with RDS data commonly relies on estimated sampling probabilities derived from specific recruitment assumptions. Early literature assumes random recruitment, which is often unrealistic because individuals may recruit based on their personal preferences. This behavior is known as Differential Recruitment (DR). Recent works have incorporated univariate categorical DR in the estimation procedures. The main objective of this paper is to introduce Multivariate Differential Recruitment (MDR), a framework that incorporates multiple simultaneous covariates, both categorical and continuous, into the sampling representation. We model RDS as a Markov process with transition probabilities that depend on continuous or categorical variables associated with nodes or their ties. We then extend various prevalence estimators to this multivariate framework and implement a slightly modified neighborhood bootstrap for variance estimation. The proposed methodology is assessed through simulation studies for a range of network and sampling features. It is applied to an RDS study conducted among the adult Venezuelan population living in the Metropolitan Region of Santiago, Chile.
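
The Markov-process view described in the abstract can be illustrated with a small sketch. The abstract does not give the functional form of the transition probabilities, so everything specific here is an assumption made for illustration: the function name `mdr_transition_matrix`, the log-linear weighting of covariate differences, and the coefficient vector `beta` are all hypothetical and are not taken from the paper. The sketch only shows how recruitment probabilities over a respondent's network ties might depend on several covariates simultaneously, and how setting `beta` to zero recovers random recruitment.

```python
import numpy as np


def mdr_transition_matrix(adjacency, covariates, beta):
    """Row-stochastic recruitment matrix for a covariate-dependent walk.

    Assumed (hypothetical) form: the weight of recruiting neighbor j from
    node i is exp(-sum_k beta[k] * |x_ik - x_jk|), so positive beta[k]
    means a preference for peers who are similar on covariate k.

    adjacency  : (n, n) symmetric 0/1 array with zero diagonal
    covariates : (n, p) array; categorical covariates coded numerically
    beta       : (p,) preference coefficients (illustrative, not estimated)
    """
    adjacency = np.asarray(adjacency, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    beta = np.asarray(beta, dtype=float)

    # Tie-level dissimilarity on each covariate: shape (n, n, p).
    diff = np.abs(covariates[:, None, :] - covariates[None, :, :])
    # Combine the p dissimilarities into one score per ordered pair: (n, n).
    score = -(diff @ beta)
    # Only existing ties can be used for recruitment.
    weights = adjacency * np.exp(score)

    row_sums = weights.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every node needs at least one tie")
    return weights / row_sums


# Toy network of four people; columns are a binary trait and age.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
])
covariates = np.array([
    [0, 30],
    [0, 32],
    [1, 50],
    [1, 48],
])

# beta = 0 gives random recruitment: uniform over each node's ties.
P_random = mdr_transition_matrix(adjacency, covariates, beta=[0.0, 0.0])
# Nonzero beta gives differential recruitment on both covariates at once.
P_mdr = mdr_transition_matrix(adjacency, covariates, beta=[1.0, 0.05])
```

In this toy example, person 0 is tied to persons 1 and 2 but is closer to person 1 on both covariates, so under the assumed weighting `P_mdr[0, 1]` exceeds `P_mdr[0, 2]`, while `P_random` treats both ties equally.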

Problem

Research questions and friction points this paper is trying to address.

Respondent-Driven Sampling
Differential Recruitment
Multivariate Covariates
Sampling Bias
Hidden Populations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multivariate Differential Recruitment
Respondent-Driven Sampling
Markov Process
Neighborhood Bootstrap
Prevalence Estimation