Causal Learning Should Embrace the Wisdom of the Crowd

📅 2026-03-03
🤖 AI Summary
This work addresses the challenge of learning causal structures, typically represented as directed acyclic graphs (DAGs), from observational data, a task hindered by the combinatorial explosion of candidate graphs and by ambiguities inherent in observations. The paper argues for a human–AI collaborative paradigm that frames causal discovery as a distributed decision-making problem, in which human experts and large language model (LLM) agents each hold fragmented, imperfect knowledge about different subsets of the variables. The proposed framework combines scalable crowdsourcing, interactive knowledge elicitation, expert opinion modeling, robust aggregation, and LLM-based simulation, with the aim of recovering a global causal structure that no individual agent could reconstruct alone.
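
To make the "combinatorial explosion" concrete, the short sketch below counts the labeled DAGs on n variables using Robinson's recurrence. It is only an illustration of the size of the search space and is not taken from the paper; the function name `count_dags` is our own.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence:
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k), a(0) = 1.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the count for small graphs; growth is super-exponential in n.
    for n in range(1, 11):
        print(n, count_dags(n))
```

Already at five variables there are 29,281 candidate graphs, which is why exhaustive search over DAGs is impractical for all but very small problems.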

📝 Abstract
Learning causal structures, typically represented by directed acyclic graphs (DAGs), from observational data is notoriously challenging due to the combinatorial explosion of possible graphs and inherent ambiguities in observations. This paper argues that causal learning is now ready for the emergence of a new paradigm supported by rapidly advancing technologies, fulfilling the long-standing vision of leveraging human causal knowledge. This paradigm integrates scalable crowdsourcing platforms for data collection, interactive knowledge elicitation for expert opinion modeling, robust aggregation techniques for expert reconciliation, and large language model (LLM)-based simulation for augmenting AI-driven information acquisition. In this paper, we focus on DAG learning for causal discovery and frame the problem as a distributed decision-making task, recognizing that each participant (human expert or LLM agent) possesses fragmented and imperfect knowledge about different subsets of the variables of interest in the causal graph. By proposing a systematic framework to synthesize these insights, we aim to enable the recovery of a global causal structure unachievable by any individual agent alone. We advocate for a new research frontier and outline a comprehensive framework of new research thrusts spanning the elicitation, modeling, aggregation, and optimization of human causal knowledge contributions.
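
As a rough illustration of the distributed setting described in the abstract, the sketch below merges partial edge reports from several participants into one global DAG. It is not the paper's method. It assumes a deliberately simple stand-in: each directed edge is scored by the fraction of participants who know both endpoints and assert it, and edges are then added greedily in score order, skipping any that would create a cycle. The function name `aggregate_dag`, the `threshold` parameter, and the example variables are all hypothetical.

```python
from collections import defaultdict


def aggregate_dag(reports, threshold=0.5):
    """Merge partial reports into one DAG (illustrative baseline only).

    reports: list of (variables, edges), one per participant, where
    `variables` is the subset the participant knows about and `edges`
    is the set of directed pairs (cause, effect) they assert.
    """
    votes = defaultdict(int)     # (u, v) -> participants asserting u -> v
    coverage = defaultdict(int)  # (u, v) -> participants who know both u and v
    for variables, edges in reports:
        for u in variables:
            for v in variables:
                if u != v:
                    coverage[(u, v)] += 1
        for edge in edges:
            votes[edge] += 1

    # Score = share of informed participants who assert the edge.
    scored = sorted(
        ((votes[e] / coverage[e], e) for e in votes if coverage[e] > 0),
        key=lambda item: (-item[0], item[1]),
    )

    children = defaultdict(set)

    def reaches(src, dst):
        """Depth-first search: is dst reachable from src in the current graph?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        return False

    dag = []
    for score, (u, v) in scored:
        # Keep well-supported edges that do not close a directed cycle.
        if score > threshold and not reaches(v, u):
            children[u].add(v)
            dag.append((u, v))
    return dag


# Hypothetical reports: no participant sees all four variables.
example_reports = [
    ({"smoking", "tar", "cancer"}, {("smoking", "tar"), ("tar", "cancer")}),
    ({"smoking", "tar"}, {("smoking", "tar")}),
    ({"tar", "cancer", "cough"}, {("tar", "cancer"), ("cancer", "cough")}),
    ({"cancer", "cough"}, {("cough", "cancer")}),
    ({"cancer", "cough"}, {("cancer", "cough")}),
]

if __name__ == "__main__":
    print(aggregate_dag(example_reports))
```

In this toy example the chain from smoking to cough is recovered even though no single participant knows all four variables, and the minority report of cough causing cancer is outvoted. Plain majority voting ignores differences in participant reliability, which is the kind of gap the expert opinion modeling and robust aggregation thrusts are meant to address.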
Problem

Research questions and friction points this paper is trying to address.

causal learning
directed acyclic graph
crowdsourcing
expert knowledge
causal discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal discovery
crowdsourcing
large language models
knowledge aggregation
directed acyclic graphs
Ryan Feng Lin
Department of Industrial and Systems Engineering, University of Washington, Seattle, WA 98195, USA
Yuantao Wei
Department of Industrial and Systems Engineering, University of Washington, Seattle, WA 98195, USA
Huiling Liao
Department of Applied Mathematics, Illinois Institute of Technology
Xiaoning Qian
Texas A&M University
Computational network biology, genomic signal processing, biomedical image analysis
Shuai Huang
University of Washington
Statistical Modeling and Analysis, Machine Learning, Healthcare, Manufacturing