🤖 AI Summary
High-dimensional treatments in AI-generated content and human-AI collaboration scenarios lead to semantic loss and infeasibility of causal inference. Method: We propose a dual-kernel representation learning framework that jointly models treatment policies and user covariates in a low-dimensional, semantically interpretable kernel space, enabling scalable causal inference. Built upon a low-rank factor model, the method learns compact, robust treatment embeddings via alternating minimization and supports generative-model-guided variant synthesis and adaptive online assignment. Contribution/Results: Numerical experiments demonstrate a 32% average reduction in estimation error for treatment effects and a 2.1× improvement in online assignment efficiency. This work establishes the first representation learning paradigm for LLM-driven digital experiments that simultaneously ensures semantic fidelity and causal interpretability.
📝 Abstract
Large Language Models (LLMs) enable a new form of digital experimentation where treatments combine human and model-generated content in increasingly sophisticated ways. The main methodological challenge in this setting is representing these high-dimensional treatments without losing their semantic meaning or rendering analysis intractable. Here, we address this problem by focusing on learning low-dimensional representations that capture the underlying structure of such treatments. These representations enable downstream applications such as guiding generative models to produce meaningful treatment variants and facilitating adaptive assignment in online experiments. We propose double kernel representation learning, which models the causal effect through the inner product of kernel-based representations of treatments and user covariates. We develop an alternating-minimization algorithm that learns these representations efficiently from data and provides convergence guarantees under a low-rank factor model. As an application of this framework, we introduce an adaptive design strategy for online experimentation and demonstrate the method's effectiveness through numerical experiments.