AI Summary
Robust benchmarking of causal inference methods on real-world data has long been hindered by the scarcity of high-fidelity, controllable simulated interventional datasets. To address this, we propose Frengression, a framework that combines the frugal parameterization with deep generative modeling to learn the joint distribution of covariates, treatments, and outcomes around the causal margin of interest. This enables accurate estimation of the causal margin, backed by consistency and extrapolation guarantees. Frengression supports faithful generation of multivariate, time-varying data and permits direct sampling from user-specified interventional distributions, which makes simulations more flexible and controllable. Validation on real-world clinical trial data demonstrates the framework's practical utility. The framework thus aims to improve the practicality, scalability, and reproducibility of causal simulation studies.
Abstract
Machine learning has revitalized causal inference by combining flexible models and principled estimators, yet robust benchmarking and evaluation remain challenging with real-world data. In this work, we introduce frengression, a deep generative realization of the frugal parameterization that models the joint distribution of covariates, treatments and outcomes around the causal margin of interest. Frengression provides accurate estimation and flexible, faithful simulation of multivariate, time-varying data; it also enables direct sampling from user-specified interventional distributions. Model consistency and extrapolation guarantees are established, with validation on real-world clinical trial data demonstrating frengression's practical utility. We envision this framework sparking new research into generative approaches for causal margin modelling.