🤖 AI Summary
This work proposes an interpretable and evolvable structured reasoning framework that addresses two weaknesses of black-box models in multimodal misinformation detection: poor interpretability and limited adaptability to novel manipulation techniques. The approach leverages multimodal large language models to automatically discover and validate high-level semantic concepts, constructs human-understandable concept graphs, and generates transparent decision chains through a hierarchical attention mechanism coupled with probabilistic graphical inference. Experiments show that the framework achieves state-of-the-art detection accuracy, markedly improves robustness to unseen manipulation types, and enables fine-grained identification of specific manipulation strategies.
📝 Abstract
Multimodal misinformation poses an escalating challenge and often evades traditional detectors, which operate as opaque black boxes and are fragile against new manipulation tactics. We present Probabilistic Concept Graph Reasoning (PCGR), an interpretable and evolvable framework that reframes multimodal misinformation detection (MMD) as structured, concept-based reasoning. PCGR follows a build-then-infer paradigm: it first constructs a graph of human-understandable concept nodes, including novel high-level concepts automatically discovered and validated by multimodal large language models (MLLMs), and then applies hierarchical attention over this concept graph to infer claim veracity. This design yields interpretable reasoning chains that link evidence to conclusions. Experiments demonstrate that PCGR achieves state-of-the-art MMD accuracy and robustness to emerging manipulation types, outperforming prior methods in both coarse detection and fine-grained manipulation recognition.
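The build-then-infer idea described above can be illustrated with a toy sketch. Everything below is hypothetical and not taken from the paper: the concept names, modality groups, evidence scores, salience logits, the two-level softmax attention, and the sigmoid readout are all assumptions chosen only to show the shape of the approach — attend over concept nodes within each group, fuse groups with a second attention step, and rank the attended concepts into a human-readable reasoning chain.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical concept graph for one image-text claim. Each node is
# (concept_name, evidence_score in [-1, 1], salience_logit); negative
# scores support "fabricated". None of these names are from the paper.
concept_graph = {
    "visual": [
        ("face_splicing_artifact", -0.9, 2.0),
        ("consistent_lighting", 0.4, 0.5),
    ],
    "textual": [
        ("sensational_wording", -0.6, 1.2),
        ("verifiable_named_entities", 0.7, 0.8),
    ],
    "cross_modal": [
        ("caption_image_mismatch", -0.8, 1.8),
    ],
}

def infer(graph):
    """Two-level attention sketch: attend within each concept group,
    then across groups; return P(real) and an attention-ranked chain."""
    groups = list(graph)
    group_scores, group_logits, node_attn = [], [], []
    for g in groups:
        nodes = graph[g]
        w = softmax([logit for _, _, logit in nodes])
        group_scores.append(sum(wi * s for wi, (_, s, _) in zip(w, nodes)))
        group_logits.append(max(logit for _, _, logit in nodes))
        node_attn.append([(name, wi) for wi, (name, _, _) in zip(w, nodes)])
    gw = softmax(group_logits)
    fused = sum(wi * s for wi, s in zip(gw, group_scores))
    p_real = 1.0 / (1.0 + math.exp(-4.0 * fused))  # temperature 4.0: assumed
    # Global attention per node = group weight * within-group weight;
    # sorting gives a transparent "reasoning chain" over the evidence.
    chain = sorted(
        ((f"{g}:{name}", gwi * wi)
         for g, gwi, attn in zip(groups, gw, node_attn)
         for name, wi in attn),
        key=lambda t: -t[1],
    )
    return p_real, chain

p, chain = infer(concept_graph)
print(f"P(real) = {p:.3f}")
for name, weight in chain:
    print(f"  {weight:.3f}  {name}")
```

In this toy run the strongly negative, highly salient nodes (splicing artifact, caption-image mismatch) dominate both attention levels, so the sigmoid readout lands well below 0.5 and the ranked chain exposes exactly which concepts drove the verdict — the kind of transparent decision chain the abstract describes.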