Zero-Shot Learning of Causal Models

📅 2024-10-08

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of structural causal model (SCM) recovery: existing methods typically train a separate generative model for each dataset. We propose the first zero-shot approach to SCM inference: given only an empirical representation of a new dataset (such as means, covariances, or neural features), the SCM is inferred by conditioning on that representation, without fine-tuning or retraining. Our method builds on the Fixed-Point Approach (FiP), amortizing it so that a single model infers the generative SCM as a function of the dataset's empirical representation. Experiments show that this amortized procedure performs on par with state-of-the-art models trained individually per dataset, on both in-distribution and out-of-distribution problems. As a by-product, it supports zero-shot generation of both observational and interventional samples. This removes the need for per-dataset training and points toward sharing causal knowledge across datasets.

📝 Abstract

With the increasing acquisition of datasets over time, we now have access to precise and varied descriptions of the world, encompassing a broad range of phenomena. These datasets can be seen as observations from unknown causal generative processes, commonly described by Structural Causal Models (SCMs). Recovering SCMs from observations poses formidable challenges, and often requires us to learn a specific generative model for each dataset. In this work, we propose to learn a single model capable of inferring SCMs in a zero-shot manner. Rather than learning a specific SCM for each dataset, we enable the Fixed-Point Approach (FiP) (Scetbon et al., 2024) to infer the generative SCMs conditionally on their empirical representations. As a by-product, our approach can perform zero-shot generation of new dataset samples and intervened samples. We demonstrate via experiments that our amortized procedure achieves performance on par with SoTA methods trained specifically for each dataset on both in-distribution and out-of-distribution problems. To the best of our knowledge, this is the first time that SCMs are inferred in a zero-shot manner from observations, paving the way for a paradigmatic shift toward the assimilation of causal knowledge across datasets. The code is available on GitHub.
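As background for the terms used in the abstract, an SCM can be read as a fixed-point problem: each variable equals a function of its causal parents plus a noise term, so a sample x solves x = F(x, n). The sketch below illustrates that reading for a hand-written linear additive-noise SCM and shows how observational and intervened samples differ. It is only an illustration of the general idea, not the paper's amortized model: the function name `solve_scm`, the weight matrix `W`, and the linear form are all assumptions made for this example.

```python
import random


def solve_scm(weights, noise, do=None):
    """Solve x = W x + noise for one sample by fixed-point iteration.

    weights[i][j] is the (assumed linear) effect of variable j on variable i.
    For an acyclic graph, len(noise) iterations starting from zero reach
    the exact fixed point. `do` maps a variable index to a clamped value,
    which replaces that variable's mechanism (a hard intervention).
    """
    do = do or {}
    d = len(noise)
    x = [0.0] * d
    for _ in range(d):
        x = [
            do[i] if i in do
            else sum(weights[i][j] * x[j] for j in range(d)) + noise[i]
            for i in range(d)
        ]
    return x


# Hypothetical 3-variable chain: x0 -> x1 -> x2.
W = [
    [0.0, 0.0, 0.0],   # x0 = n0
    [2.0, 0.0, 0.0],   # x1 = 2 * x0 + n1
    [0.0, -1.0, 0.0],  # x2 = -x1 + n2
]

rng = random.Random(0)
noises = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5)]

# Observational samples: solve the SCM as is.
observational = [solve_scm(W, n) for n in noises]

# Interventional samples under do(x1 = 3): x1 ignores x0 and its own noise.
interventional = [solve_scm(W, n, do={1: 3.0}) for n in noises]
```

Here the SCM is given by hand. In the zero-shot setting described in the abstract, the model would instead have to infer the generative mechanisms from a dataset's empirical representation.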

Problem

Research questions and friction points this paper is trying to address.

Infer Structural Causal Models zero-shot

Generalize across datasets with one model

Enable zero-shot generation of datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot learning of SCMs

Fixed-Point Approach application

Amortized causal model inference

🔎 Similar Papers

The Causal Information Bottleneck and Optimal Causal Variable Abstractions