Multi-Perspective Evidence Synthesis and Reasoning for Unsupervised Multimodal Entity Linking

📅 2026-04-22
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses a limitation of existing unsupervised multimodal entity linking methods: they rely heavily on instance-centric features and neglect multi-perspective evidence and the interdependencies among its sources. The authors propose MSR-MEL, a framework that combines offline multi-perspective evidence synthesis with online large language model (LLM) reasoning. MSR-MEL introduces group-level evidence, which aggregates neighborhood information over LLM-enhanced contextualized graphs aligned across modalities by an asymmetric teacher-student graph neural network, and then uses an LLM to reason over the combined evidence and rank candidate entities. On widely used MEL benchmarks, MSR-MEL consistently outperforms state-of-the-art unsupervised methods.

๐Ÿ“ Abstract
Multimodal Entity Linking (MEL) is a fundamental task in data management that maps ambiguous mentions with diverse modalities to multimodal entities in a knowledge base. However, most existing MEL approaches focus on optimizing instance-centric features and evidence, leaving broader forms of evidence and their interdependencies insufficiently explored. Motivated by the observation that human experts rely on multi-perspective judgment when making decisions, we propose MSR-MEL, a Multi-perspective Evidence Synthesis and Reasoning framework with Large Language Models (LLMs) for unsupervised MEL. Specifically, we adopt a two-stage framework: (1) Offline Multi-Perspective Evidence Synthesis constructs a comprehensive set of evidence. This includes instance-centric evidence capturing the multimodal information of individual mentions and entities, group-level evidence that aggregates neighborhood information, lexical evidence based on string overlap ratio, and statistical evidence based on simple summary statistics. A core contribution of our framework is the synthesis of group-level evidence, which aggregates vital neighborhood information over graphs. We first construct LLM-enhanced contextualized graphs. Subsequently, different modalities are jointly aligned through an asymmetric teacher-student graph neural network. (2) Online Multi-Perspective Evidence Reasoning uses an LLM as a reasoning module to analyze the correlation and semantics of the multi-perspective evidence and induce an effective ranking strategy for accurate entity linking without supervision. Extensive experiments on widely used MEL benchmarks demonstrate that MSR-MEL consistently outperforms state-of-the-art unsupervised methods. The source code is available at: https://anonymous.4open.science/r/MSR-MEL-C21E/.
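
To make the two-stage idea concrete, the sketch below assembles several evidence perspectives for one mention and formats them as input for an LLM ranker. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the overlap ratio, so character-bigram Jaccard overlap is assumed here; the instance-centric and group-level scores are taken as precomputed inputs; the mean and max statistics are one possible reading of "simple summary statistics"; and all names (`overlap_ratio`, `Evidence`, `synthesize`, `summary_stats`, `build_prompt`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


def char_bigrams(text: str) -> set:
    """Lowercased character bigrams; a 1-character string yields itself."""
    s = text.lower().strip()
    if len(s) < 2:
        return {s} if s else set()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def overlap_ratio(mention: str, entity_name: str) -> float:
    """Lexical evidence: Jaccard overlap of bigrams (assumed definition)."""
    a, b = char_bigrams(mention), char_bigrams(entity_name)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class Evidence:
    entity: str
    instance: float  # assumed precomputed instance-centric similarity
    group: float     # assumed precomputed group-level similarity
    lexical: float   # computed here from the surface strings


def synthesize(mention: str, instance_scores: dict, group_scores: dict) -> list:
    """Offline stage (sketch): one Evidence row per candidate entity."""
    return [
        Evidence(name, instance_scores[name], group_scores[name],
                 overlap_ratio(mention, name))
        for name in instance_scores
    ]


def summary_stats(rows: list) -> dict:
    """Statistical evidence (sketch): mean and max of each perspective."""
    stats = {}
    for field in ("instance", "group", "lexical"):
        values = [getattr(r, field) for r in rows]
        stats[field] = {"mean": mean(values), "max": max(values)}
    return stats


def build_prompt(mention: str, rows: list) -> str:
    """Online stage (sketch): serialize the evidence for an LLM ranker."""
    lines = [f"Mention: {mention}", "Candidates:"]
    for r in rows:
        lines.append(
            f"- {r.entity}: instance={r.instance:.2f}, "
            f"group={r.group:.2f}, lexical={r.lexical:.2f}"
        )
    lines.append("Per-perspective statistics:")
    for field, s in summary_stats(rows).items():
        lines.append(f"- {field}: mean={s['mean']:.2f}, max={s['max']:.2f}")
    lines.append("Rank the candidates from most to least likely.")
    return "\n".join(lines)
```

Calling `build_prompt("Jordan", synthesize("Jordan", instance_scores, group_scores))` with two score dictionaries keyed by candidate name returns a text block that could be passed to any LLM client. The call to the model itself is left out on purpose, since the abstract does not say which model or prompt format the authors use.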
Problem

Research questions and friction points this paper is trying to address.

Multimodal Entity Linking
Unsupervised Learning
Evidence Synthesis
Multi-Perspective Reasoning
Knowledge Base
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-perspective Evidence Synthesis
Unsupervised Multimodal Entity Linking
Large Language Models
Graph Neural Networks
Evidence Reasoning