🤖 AI Summary
This work addresses the difficulty that existing causal discovery methods have in modeling the interplay of semantic, spatial, and temporal contexts in social media text, which leads to inaccurate identification of causal relationships among disaster events. To this end, the authors propose CaST, a framework that integrates these three contextual dimensions into a spatio-temporal event graph built for each window of tweets. Specifically, CaST uses a large language model pretrained on disaster datasets to extract semantic features, enriches each event node with geographic coordinates and temporal features, and applies a multi-head graph attention network to learn directed causal dependencies. Evaluated on a newly curated dataset of approximately 167,000 tweets related to Hurricane Harvey, CaST outperforms both traditional and state-of-the-art methods. Ablation studies further show that spatial and temporal signals substantially improve recall and training stability, which supports more robust and interpretable causal discovery.
📝 Abstract
Understanding causality between real-world events from social media is essential for situational awareness, yet existing causal discovery methods often overlook the interplay between semantic, spatial, and temporal contexts. We propose CaST: Causal Discovery via Spatio-Temporal Graphs, a unified framework for causal discovery in the disaster domain that integrates semantic similarity and spatio-temporal proximity using Large Language Models (LLMs) pretrained on disaster datasets. CaST constructs an event graph for each window of tweets. Each event extracted from tweets is represented as a node embedding enriched with its contextual semantics, geographic coordinates, and temporal features. These event nodes are then connected to form a spatio-temporal event graph, which is processed using a multi-head Graph Attention Network (GAT) to learn directed causal relationships. We construct an in-house dataset of approximately 167K disaster-related tweets collected during Hurricane Harvey and annotated following the MAVEN-ERE schema. Experimental results show that CaST achieves superior performance over both traditional and state-of-the-art methods. Ablation studies further confirm that incorporating spatial and temporal signals substantially improves both recall and stability during training. Overall, CaST demonstrates that integrating spatio-temporal reasoning into event graphs enables more robust and interpretable causal discovery in disaster-related social media text.
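To make the graph-construction step concrete, here is a minimal sketch of how events in one tweet window might be turned into node features and candidate directed edges. It is not the authors' implementation. The `Event` fields, the helper names, the thresholds (`max_km`, `max_hours`, `min_sim`), and the rule of linking earlier events to later ones are all assumptions made for illustration; the abstract does not specify how nodes are connected. A GAT would then operate on the resulting nodes and edges.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative assumptions."""
    text: str
    embedding: List[float]  # stand-in for an LLM-derived semantic embedding
    lat: float
    lon: float
    t_hours: float          # hours since the start of the tweet window


def node_features(e: Event) -> List[float]:
    """Concatenate semantic, spatial, and temporal features into one vector."""
    return list(e.embedding) + [e.lat, e.lon, e.t_hours]


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_event_graph(
    events: List[Event],
    max_km: float = 50.0,     # assumed spatial proximity threshold
    max_hours: float = 12.0,  # assumed temporal proximity threshold
    min_sim: float = 0.3,     # assumed semantic similarity threshold
) -> List[Tuple[int, int]]:
    """Return candidate directed edges (i, j), where event i strictly precedes j."""
    edges = []
    for i, a in enumerate(events):
        for j, b in enumerate(events):
            if i == j or a.t_hours >= b.t_hours:
                continue  # only link a strictly earlier event to a later one
            if (b.t_hours - a.t_hours <= max_hours
                    and haversine_km(a, b) <= max_km
                    and cosine(a.embedding, b.embedding) >= min_sim):
                edges.append((i, j))
    return edges
```

In this sketch the edges are only candidates: they encode which event pairs are close enough in meaning, space, and time to be worth scoring. Deciding which candidates are actually causal is left to the learned attention model described in the abstract.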