AI Summary
This work addresses the poor generalization of uplift estimation under combinatorial interventions, such as context-dependent policies. To this end, it proposes a policy representation aligned with causal semantics by modeling a policy as the mixture distribution it induces over context-action components and embedding it via permutation-invariant aggregation. Integrating an orthogonalized low-rank model with a vector-valued Robinson decomposition, the method effectively isolates pure incremental effects, substantially enhancing generalization to rare or previously unseen policies. Empirical evaluation on large-scale randomized experimental data from a real-world platform demonstrates that the proposed approach significantly improves both accuracy and stability of uplift estimation across overall and long-tail policy scenarios.
Abstract
We study uplift estimation for combinatorial treatments. Uplift measures the pure incremental causal effect of an intervention (e.g., sending a coupon or a marketing message) on user behavior, modeled as a conditional individual treatment effect. Many real-world interventions are combinatorial: a treatment is a policy that specifies context-dependent action distributions rather than a single atomic label. Although recent work considers structured treatments, most methods rely on categorical or opaque encodings, limiting robustness and generalization to rare or newly deployed policies. We propose an uplift estimation framework that aligns treatment representation with causal semantics. Each policy is represented by the mixture it induces over context-action components and embedded via a permutation-invariant aggregation. This representation is integrated into an orthogonalized low-rank uplift model, extending Robinson-style decompositions to learned, vector-valued treatments. We show that the resulting estimator is expressive for policy-induced causal effects, orthogonally robust to nuisance estimation errors, and stable under small policy perturbations. Experiments on large-scale randomized platform data demonstrate improved uplift accuracy and stability in long-tailed policy regimes.
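The two ingredients named in the abstract, a permutation-invariant policy embedding and a Robinson-style residual regression on a vector-valued treatment, can be illustrated with a minimal sketch. This is not the authors' model. It assumes an uplift that is linear in the policy embedding and independent of user features, whereas the paper describes a low-rank, conditional model. It also uses a weighted sum as the aggregation, synthetic data, and an oracle outcome nuisance. The names `embed_policy`, `fit_residual_uplift`, `table`, and `theta_true` are hypothetical.

```python
import numpy as np


def embed_policy(weights, component_ids, table):
    """Embed a policy as the weighted sum of its component embeddings.

    A weighted sum does not depend on the order in which components are
    listed, so the embedding is permutation-invariant.
    """
    weights = np.asarray(weights, dtype=float)
    rows = table[np.asarray(component_ids)]
    return weights @ rows


def fit_residual_uplift(y, z, m_hat, e_hat):
    """Robinson-style fit: regress (y - m_hat) on (z - e_hat).

    Simplifying assumption: the uplift is linear in the policy
    embedding, tau(z) = z @ theta, with no dependence on user features.
    """
    theta, *_ = np.linalg.lstsq(z - e_hat, y - m_hat, rcond=None)
    return theta


# Toy randomized experiment (synthetic data, illustration only).
rng = np.random.default_rng(0)
n, k, d = 500, 5, 3
table = rng.normal(size=(k, d))            # hypothetical component embeddings
mix = rng.dirichlet(np.ones(k), size=n)    # one mixture per assigned policy
z = np.stack([embed_policy(w, np.arange(k), table) for w in mix])

x = rng.normal(size=n)                     # a single user feature
theta_true = np.array([1.0, -0.5, 0.25])
y = np.sin(x) + z @ theta_true + 0.1 * rng.normal(size=n)

# Under randomization E[z | x] is constant, so the sample mean serves as e_hat.
e_hat = z.mean(axis=0)
# Oracle outcome nuisance for the toy: E[y | x] = sin(x) + e_hat @ theta_true.
m_hat = np.sin(x) + e_hat @ theta_true

theta_hat = fit_residual_uplift(y, z, m_hat, e_hat)
print(theta_hat)
```

Subtracting `m_hat` removes the baseline term `sin(x)` from the outcome, so the regression only sees variation attributable to the policy embedding. In practice both nuisances would be estimated, typically with cross-fitting, rather than supplied by an oracle.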