🤖 AI Summary
Existing hallucination research focuses narrowly on either the cross-lingual or the cross-modal dimension in isolation, lacking systematic investigation of their joint occurrence. Method: We introduce CCHall, the first benchmark for joint cross-lingual and cross-modal hallucination detection, explicitly defining and evaluating hallucinations in large language models (LLMs) under mixed multilingual-multimodal inputs. Built upon adversarial test sets derived from multilingual text–image pairs, CCHall combines human verification with automated metrics to establish a rigorous evaluation protocol. Results: Experiments across leading open-source and closed-source LLMs reveal significantly elevated hallucination rates and severely limited generalization in this joint setting. CCHall bridges a critical gap in multidimensional hallucination assessment, providing both a foundational benchmark and an analytical framework for improving the robustness of multilingual multimodal models.
📝 Abstract
Investigating hallucination issues in large language models (LLMs) in cross-lingual and cross-modal scenarios can greatly advance their large-scale deployment in real-world applications. Nevertheless, current studies are limited to a single scenario, either cross-lingual or cross-modal, leaving a gap in the exploration of hallucinations in joint cross-lingual and cross-modal settings. Motivated by this, we introduce a novel joint Cross-lingual and Cross-modal Hallucination benchmark (CCHall) to fill this gap. Specifically, CCHall simultaneously incorporates both cross-lingual and cross-modal hallucination scenarios, and can therefore be used to assess the cross-lingual and cross-modal capabilities of LLMs. Furthermore, we conduct a comprehensive evaluation on CCHall, covering both mainstream open-source and closed-source LLMs. The experimental results show that current LLMs still struggle with CCHall. We hope CCHall can serve as a valuable resource for assessing LLMs in joint cross-lingual and cross-modal scenarios.