AI Summary
Existing approaches struggle to detect misleading content that arises from the absence of critical background information, particularly when the text appears locally coherent. This work proposes Latent Causal Void (LCV), a method that retrieves temporally aligned context articles and uses a frozen instruction-tuned large language model to explicitly reconstruct the missing fact linking each target sentence to its broader context. These cross-source factual relationships are modeled as textual edges in a heterogeneous graph over target sentences and context articles, enabling joint reasoning via graph neural networks. Unlike prior strategies that only append retrieved evidence or predict a categorical omission signal, this approach combines explicit missing-fact reconstruction with structured graph-based inference. Evaluated on the bilingual (Chinese–English) benchmark of Sheng et al., the method outperforms the strongest omission-aware baseline by 2.56 and 2.84 macro-F1 points on the English and Chinese splits, respectively.
Abstract
Automatic misinformation detection performs well when deception is visible in what an article explicitly states. However, some misinformation articles remain locally coherent and only become misleading once compared with contemporaneous reports that supply background facts the article omits. We study this omission-relevant setting and observe that current omission-aware approaches typically either attach retrieved context as auxiliary evidence or infer a categorical omission signal, leaving the specific missing fact implicit. We propose \emph{Latent Causal Void} (LCV), a retrieval-guided detector that explicitly reconstructs the missing fact for each target sentence and uses it as a textual cross-source relation in graph reasoning. Concretely, LCV retrieves temporally aligned context articles, asks a frozen instruction-tuned large language model to generate a short missing-context description for each sentence--article pair, and feeds the resulting relation text into a heterograph over target sentences and context articles. On the bilingual benchmark of Sheng et al., LCV improves over the strongest omission-aware baseline by $2.56$ and $2.84$ macro-F1 points on the English and Chinese splits, respectively. The results indicate that modeling the missing cross-source fact itself, rather than only attaching retrieved evidence or predicting an omission signal, is a useful representation for omission-aware misinformation detection.
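The data flow described above (retrieve temporally aligned context, generate a missing-context description per sentence--article pair, store it as a textual edge) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the date-window filter, the names `Article`, `HeteroGraph`, `retrieve_aligned`, `build_graph`, and the `describe_gap` callback are all hypothetical, and `toy_describe_gap` is only a stand-in for the frozen instruction-tuned LLM. The graph neural network that reasons over the resulting heterograph is omitted.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable, List, Tuple


@dataclass
class Article:
    article_id: str
    published: date
    text: str


@dataclass
class HeteroGraph:
    """Two node types (target sentences, context articles) plus text-labeled edges."""
    sentences: List[str] = field(default_factory=list)
    contexts: List[Article] = field(default_factory=list)
    # Each edge is (sentence index, context index, relation text).
    edges: List[Tuple[int, int, str]] = field(default_factory=list)


def retrieve_aligned(target_date: date, pool: List[Article],
                     window_days: int = 3) -> List[Article]:
    """Assumed retrieval rule: keep articles published within a date window."""
    window = timedelta(days=window_days)
    return [a for a in pool if abs(a.published - target_date) <= window]


def build_graph(sentences: List[str], target_date: date, pool: List[Article],
                describe_gap: Callable[[str, str], str],
                window_days: int = 3) -> HeteroGraph:
    """Connect every sentence to every retrieved article via a relation text."""
    graph = HeteroGraph(
        sentences=list(sentences),
        contexts=retrieve_aligned(target_date, pool, window_days),
    )
    for s_idx, sentence in enumerate(graph.sentences):
        for c_idx, ctx in enumerate(graph.contexts):
            relation = describe_gap(sentence, ctx.text)
            if relation:  # skip pairs for which no description is produced
                graph.edges.append((s_idx, c_idx, relation))
    return graph


def toy_describe_gap(sentence: str, context_text: str) -> str:
    """Placeholder for the LLM call; a real system would prompt a model here."""
    return "Context adds: " + context_text[:80]
```

Calling `build_graph(sentences, target_date, pool, toy_describe_gap)` returns a graph whose `edges` carry the generated relation text. In a full system those strings would be encoded and passed, together with the node texts, to the graph model.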