🤖 AI Summary
This work addresses the challenge posed by multimodal disinformation, whose dynamic evolution and time-sensitive nature often lead existing detection methods to retrieve redundant or irrelevant evidence. To mitigate this issue, the authors propose ExDR, a framework that integrates model-generated explanations into both retrieval triggering and evidence retrieval. Specifically, ExDR decides whether to trigger retrieval based on confidence assessed along three complementary dimensions, constructs entity-aware indices that incorporate deceptive entities, and retrieves contrastive evidence guided by deception-specific features to challenge the original claim. Experiments on the AMG and MR2 datasets show that ExDR consistently outperforms previous methods in retrieval triggering accuracy, retrieval quality, and overall detection performance, which the authors take as evidence of its effectiveness and generalization capability.
📝 Abstract
The rapid spread of multimodal fake news poses a serious societal threat, as its evolving nature and reliance on timely factual details challenge existing detection methods. Dynamic Retrieval-Augmented Generation provides a promising solution by triggering keyword-based retrieval and incorporating external knowledge, thus enabling both efficient and accurate evidence selection. However, when applied to deceptive content, it still suffers from redundant retrieval, coarse similarity, and irrelevant evidence. In this paper, we propose ExDR, an Explanation-driven Dynamic Retrieval-Augmented Generation framework for Multimodal Fake News Detection. Our framework systematically leverages model-generated explanations in both the retrieval triggering and evidence retrieval modules. It assesses triggering confidence from three complementary dimensions, constructs entity-aware indices by fusing deceptive entities, and retrieves contrastive evidence based on deception-specific features to challenge the initial claim and enhance the final prediction. Experiments on two benchmark datasets, AMG and MR2, demonstrate that ExDR consistently outperforms previous methods in retrieval triggering accuracy, retrieval quality, and overall detection performance, highlighting its effectiveness and generalization capability.
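The control flow of dynamic retrieval triggering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three confidence dimensions or say how they are combined, so the field names `dim_a`, `dim_b`, `dim_c`, the "weakest dimension below a threshold" rule, the threshold value, and the `retrieve` and `classify` callables are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TriggerScores:
    """Three confidence scores in [0, 1].

    Assumption: the abstract only says "three complementary dimensions",
    so these field names are placeholders, not the paper's terminology.
    """
    dim_a: float
    dim_b: float
    dim_c: float


def should_retrieve(scores: TriggerScores, threshold: float = 0.7) -> bool:
    """Return True when external evidence should be fetched.

    Assumed rule: trigger retrieval if the weakest of the three
    dimensions falls below the threshold. The actual combination
    rule is not given in the abstract.
    """
    weakest = min(scores.dim_a, scores.dim_b, scores.dim_c)
    return weakest < threshold


def detect(
    claim: str,
    scores: TriggerScores,
    retrieve: Callable[[str], List[str]],
    classify: Callable[[str, List[str]], str],
    threshold: float = 0.7,
) -> str:
    """Classify a claim, calling the retriever only when triggered."""
    # Skipping retrieval for confident cases is what avoids redundant lookups.
    evidence = retrieve(claim) if should_retrieve(scores, threshold) else []
    return classify(claim, evidence)
```

In this sketch a confident prediction bypasses the retriever entirely, which is the mechanism by which a dynamic trigger reduces redundant retrieval; any real system would replace the stubbed callables with its own retriever and detector.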