MINED: Probing and Updating with Multimodal Time-Sensitive Knowledge for Large Multimodal Models

📅 2025-10-22
🤖 AI Summary
Problem: Large multimodal models (LMMs) rely on static pretraining, which limits their ability to understand time-sensitive factual knowledge accurately, and prevailing evaluation benchmarks are themselves static and cover too few dimensions. Method: We introduce MINED, the first time-sensitive knowledge benchmark tailored for LMMs, comprising six dimensions (cognition, awareness, trustworthiness, understanding, reasoning, and robustness) and eleven tasks. Constructed from Wikipedia by two professional annotators, MINED contains 2,104 samples spanning six knowledge types and uses a Composite Evaluation Metric (CEM) for holistic scoring. Contribution/Results: Across 15 widely used LMMs, Gemini-2.5-Pro achieves the highest average CEM score (63.07), while most open-source models still lack temporal understanding. Models perform best on organization knowledge and worst on sport knowledge. MINED exposes capability gaps in dynamic fact understanding and shows that knowledge editing methods can update time-sensitive knowledge effectively in single-editing scenarios.

📝 Abstract
Large Multimodal Models (LMMs) encode rich factual knowledge via cross-modal pre-training, yet their static representations struggle to maintain an accurate understanding of time-sensitive factual knowledge. Existing benchmarks remain constrained by static designs and inadequately evaluate LMMs' ability to understand time-sensitive knowledge. To address this gap, we propose MINED, a comprehensive benchmark that evaluates temporal awareness across 11 challenging tasks spanning 6 key dimensions: cognition, awareness, trustworthiness, understanding, reasoning, and robustness. MINED is constructed from Wikipedia by two professional annotators and contains 2,104 time-sensitive knowledge samples spanning six knowledge types. Evaluating 15 widely used LMMs on MINED shows that Gemini-2.5-Pro achieves the highest average CEM score of 63.07, while most open-source LMMs still lack the ability to understand time. LMMs perform best on organization knowledge and worst on sport knowledge. To address these challenges, we investigate the feasibility of updating time-sensitive knowledge in LMMs through knowledge editing methods and observe that these methods update knowledge effectively in single-editing scenarios.

Problem

Research questions and friction points this paper is trying to address.

Evaluating temporal awareness in multimodal models using dynamic benchmarks
Addressing static knowledge limitations in large multimodal AI systems
Updating time-sensitive factual knowledge through knowledge editing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal benchmark for time-sensitive knowledge evaluation
Knowledge editing methods for updating temporal information
Comprehensive assessment across six dimensions and eleven tasks
Kailin Jiang
University of Science and Technology of China
Ning Jiang
Northeast Forestry University
Yuchen Ren
Renmin University of China
Yuchen Li
Anhui Polytechnic University
Yifan Gao
University of Science and Technology of China
Jinhe Bi
LMU Munich
Efficient AI, M/LLM
Yunpu Ma
Ludwig Maximilian University of Munich
Foundation Models, Agentic AI, Temporal Knowledge Graph, Quantum AI
Qingqing Liu
Beijing Institute of Technology
Xianhao Wang
University of Science and Technology of China
Yifan Jia
Shandong University
Hongbo Jiang
Hunan University
Mobile Computing, Wireless Networking, Privacy Preserving
Yaocong Hu
Anhui Polytechnic University
Bin Li
University of Science and Technology of China
Lei Liu
University of Science and Technology of China
Yuntao Du
Purdue University
Privacy