🤖 AI Summary
This work addresses the challenge of factual inaccuracies and hallucinations commonly observed in abstractive summarization for low-resource languages. To mitigate these issues, the authors propose SBARThez, a novel framework that integrates multimodal and language-agnostic sentence embeddings—leveraging LaBSE, SONAR, and BGE-M3—and incorporates a named entity injection mechanism to enhance factual consistency in generated summaries. SBARThez supports both textual and spoken inputs, enabling cross-lingual abstractive summarization. Experimental results demonstrate that the proposed approach significantly improves factual accuracy and conciseness of summaries in low-resource settings, achieving performance comparable to strong token-based baseline methods.
📝 Abstract
Abstractive summarization aims to generate concise summaries by creating new sentences, allowing for flexible rephrasing. However, this approach is vulnerable to inaccuracies, particularly 'hallucinations', where the model introduces non-existent information. In this paper, we leverage multimodal and multilingual sentence embeddings derived from pretrained models such as LaBSE, SONAR, and BGE-M3, and feed them into a modified BART-based French model. We introduce a Named Entity Injection mechanism that appends tokenized named entities to the decoder input in order to improve the factual consistency of the generated summary. Our novel framework, SBARThez, is applicable to both text and speech inputs and supports cross-lingual summarization; it shows competitive performance relative to token-level baselines, especially for low-resource languages, while generating more concise and abstract summaries.
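The Named Entity Injection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the separator token id, the example token ids, and the prepend-vs-append ordering within each entity are assumptions; the only source-stated idea is that tokenized named entities from the input are attached to the decoder input so the generator can reproduce them faithfully.

```python
# Hypothetical sketch of Named Entity Injection: named entities extracted
# from the source document are tokenized, then appended to the decoder
# input sequence, each preceded by a separator token (all ids invented).
SEP_ID = 99  # hypothetical separator token id


def inject_entities(decoder_input_ids, entity_token_ids):
    """Append tokenized named entities to the decoder input ids."""
    injected = list(decoder_input_ids)
    for entity in entity_token_ids:
        injected.append(SEP_ID)
        injected.extend(entity)
    return injected


decoder_input = [0, 12, 47]        # e.g. BOS id plus a summary prefix (invented ids)
entities = [[501, 502], [733]]     # token ids of two extracted entities (invented)
print(inject_entities(decoder_input, entities))
```

A real pipeline would obtain the entities from an NER tagger over the source text and tokenize them with the summarizer's own tokenizer before injection.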