AI Summary
To address factual inconsistency in multimodal summarization, this paper proposes the first fine-grained, interpretable dual-path factuality evaluation framework, jointly supporting reference-based supervised evaluation and reference-free open-scenario assessment. Methodologically, it integrates multimodal alignment modeling, cross-modal factual verification, and explainable score decomposition to enable error localization and natural-language explanation generation. Evaluated across multiple benchmarks, the framework substantially outperforms conventional metrics (e.g., BLEU, ROUGE) and achieves a 32% improvement in correlation with human judgments. The code and dataset are publicly released, establishing a new paradigm and practical toolkit for factuality research in multimodal summarization.
Abstract
Multimodal summarization aims to generate a concise summary from input text and images. However, existing methods may produce unfactual output. To evaluate the factuality of multimodal summarization models, we propose two fine-grained and explainable evaluation frameworks (FALLACIOUS) for different application scenarios: a reference-based factuality evaluation framework and a reference-free factuality evaluation framework. Notably, the reference-free framework does not require ground truth and therefore applies to a wider range of scenarios. To assess the effectiveness of the proposed frameworks, we compute the correlation between our frameworks and other metrics. The experimental results demonstrate the effectiveness of our proposed method. We will release our code and dataset via GitHub.
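The abstract evaluates the frameworks by correlating their scores with other metrics and with human judgments. As a minimal sketch of how such a meta-evaluation is typically computed (the score lists below are illustrative placeholders, not data from the paper):

```python
# Hedged sketch: meta-evaluating a factuality metric by correlating its
# per-summary scores with human factuality judgments on the same summaries.
# All numbers here are hypothetical, for illustration only.
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson computed on ranks (no tie handling)."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=vs.__getitem__)
        r = [0] * len(vs)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    return pearson(ranks(xs), ranks(ys))


# Hypothetical per-summary scores (one entry per generated summary).
metric_scores = [0.91, 0.40, 0.75, 0.22, 0.60]  # automatic factuality scores
human_scores = [5, 2, 4, 1, 3]                  # 1-5 human factuality ratings

print(f"Pearson  r   = {pearson(metric_scores, human_scores):.3f}")
print(f"Spearman rho = {spearman(metric_scores, human_scores):.3f}")
```

A higher correlation indicates that the automatic metric ranks summaries more consistently with human judgments, which is the standard way claims like "improved correlation with human judgments" are substantiated.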