Characterizing and Minimizing Divergent Delivery in Meta Advertising Experiments

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
In Meta’s advertising A/B tests, divergent delivery—where algorithmic targeting automatically directs variants to distinct audience segments—confounds causal estimates of ad creative effects with compositional shifts in audience demographics and behavior, undermining internal validity. Method: Leveraging 3,204 Lift tests and 181,890 A/B tests, the authors conduct the first large-scale empirical investigation of this bias. They quantify distributional divergence across treatment and control groups and identify key configuration drivers (e.g., audience overlap constraints, budget allocation mechanisms). Contribution/Results: Lift tests exhibit no significant audience imbalance, whereas A/B tests show robust, measurable shifts in demographic and behavioral distributions between variants. The paper proposes actionable experimental design improvements grounded in these findings; empirical validation demonstrates that optimized configurations substantially reduce delivery bias, improving both the unbiasedness of effect estimates and external validity. This work provides both theoretical grounding and practical guidelines for causal evaluation in industrial advertising systems.

📝 Abstract
Many digital platforms offer advertisers experimentation tools like Meta's Lift and A/B tests to optimize their ad campaigns. Lift tests compare outcomes between users eligible to see ads versus users in a no-ad control group. In contrast, A/B tests compare users exposed to alternative ad configurations, absent any control group. The latter setup raises the prospect of divergent delivery: ad delivery algorithms may target different ad variants to different audience segments. This complicates causal interpretation because results may reflect both ad content effectiveness and changes to audience composition. We offer three key contributions. First, we make clear that divergent delivery is specific to A/B tests and intentional, informing advertisers about ad performance in practice. Second, we measure divergent delivery at scale, considering 3,204 Lift tests and 181,890 A/B tests. Lift tests show no meaningful audience imbalance, confirming their causal validity, while A/B tests show clear imbalance, as expected. Third, we demonstrate that campaign configuration choices can reduce divergent delivery in A/B tests, lessening algorithmic influence on results. While no configuration guarantees eliminating divergent delivery entirely, we offer evidence-based guidance for those seeking more generalizable insights about ad content in A/B tests.
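The abstract's notion of measuring audience imbalance can be made concrete with any symmetric divergence over the demographic composition of impressions delivered to each variant. The sketch below is illustrative only—the paper does not specify this metric—using Jensen-Shannon divergence over hypothetical age-bucket shares for two ad variants; zero indicates identical delivery, values near 1 (in base 2) indicate strong divergent delivery:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are lists of non-negative shares that each sum to 1.
    Returns a value in [0, 1]: 0 for identical distributions.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability cells
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return (kl(p, m) + kl(q, m)) / 2

# Hypothetical shares of impressions across four age buckets
# for two ad variants in the same A/B test (not real data)
variant_a = [0.30, 0.40, 0.20, 0.10]
variant_b = [0.10, 0.25, 0.35, 0.30]

print(round(js_divergence(variant_a, variant_b), 3))  # clearly > 0: divergent delivery
print(js_divergence(variant_a, variant_a))            # 0.0: balanced delivery
```

In a Lift test, where assignment is randomized and delivery optimization acts identically on both groups, this statistic should be near zero up to sampling noise; the paper's finding is that A/B tests show systematically positive imbalance.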
Problem

Research questions and friction points this paper is trying to address.

Addressing divergent delivery in Meta A/B ad tests
Measuring audience imbalance across thousands of experiments
Reducing algorithmic influence on ad test results
Innovation

Methods, ideas, or system contributions that make the work stand out.

Measuring divergent delivery at scale
Using Lift tests for causal validity
Reducing algorithmic influence via configuration