RSHallu: Dual-Mode Hallucination Evaluation for Remote-Sensing Multimodal Large Language Models with Domain-Tailored Mitigation

📅 2026-02-11
🤖 AI Summary
This study addresses hallucination, where model outputs contradict the input imagery, in multimodal large language models for remote sensing; the problem is critical yet underexplored and severely limits deployment in high-stakes applications. The work formalizes remote-sensing hallucinations, introduces image-level hallucination, and presents RSHalluEval, a benchmark comprising 2,023 question-answer pairs, alongside RSHalluCheck, a 15,396-pair dataset used to fine-tune a compact local checker. A dual-mode checking framework pairs high-precision cloud auditing with low-cost, reproducible local checking. For mitigation, the authors provide RSHalluShield, a 30k-pair domain-tailored training dataset, and training-free plug-and-play strategies: decoding-time logit correction and remote sensing-aware prompting. Under a unified evaluation protocol, these methods improve the hallucination-free rate of representative RS-MLLMs by up to 21.63 percentage points while maintaining competitive performance on downstream tasks such as RSVQA and RSVG.

📝 Abstract
Multimodal large language models (MLLMs) are increasingly adopted in remote sensing (RS) and have shown strong performance on tasks such as RS visual grounding (RSVG), RS visual question answering (RSVQA), and multimodal dialogue. However, hallucinations, which are responses inconsistent with the input RS images, severely hinder their deployment in high-stakes scenarios (e.g., emergency management and agricultural monitoring) and remain under-explored in RS. In this work, we present RSHallu, a systematic study with three deliverables: (1) we formalize RS hallucinations with an RS-oriented taxonomy and introduce image-level hallucination to capture RS-specific inconsistencies beyond object-centric errors (e.g., modality, resolution, and scene-level semantics); (2) we build a hallucination benchmark, RSHalluEval (2,023 QA pairs), and enable dual-mode checking, supporting high-precision cloud auditing and low-cost reproducible local checking via a compact checker fine-tuned on the RSHalluCheck dataset (15,396 QA pairs); and (3) we introduce a domain-tailored dataset, RSHalluShield (30k QA pairs), for training-friendly mitigation and further propose training-free plug-and-play strategies, including decoding-time logit correction and RS-aware prompting. Across representative RS-MLLMs, our mitigation improves the hallucination-free rate by up to 21.63 percentage points under a unified protocol, while maintaining competitive performance on downstream RS tasks (RSVQA/RSVG). Code and datasets will be released.
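The headline result is stated in percentage points, which is an absolute difference between two rates rather than a relative change. The sketch below illustrates that arithmetic under one assumption: that the hallucination-free rate is the share of responses a checker judges free of hallucination. The abstract does not give the exact definition, so the function names, the boolean verdict format, and the sample verdicts are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def hallucination_free_rate(verdicts: Sequence[bool]) -> float:
    """Return the percentage of responses judged hallucination-free.

    Assumption: each entry is True when a checker found no hallucination
    in the corresponding response. This definition is illustrative only.
    """
    if not verdicts:
        raise ValueError("verdicts must not be empty")
    return 100.0 * sum(verdicts) / len(verdicts)


def gain_in_percentage_points(before: Sequence[bool], after: Sequence[bool]) -> float:
    """Absolute difference between two rates, in percentage points."""
    return hallucination_free_rate(after) - hallucination_free_rate(before)


# Made-up verdicts for illustration only; these are not results from the paper.
baseline = [True, False, False, True, False]   # 2 of 5 responses are clean
mitigated = [True, True, False, True, False]   # 3 of 5 responses are clean
gain = gain_in_percentage_points(baseline, mitigated)
```

With these made-up verdicts the rate moves from 40% to 60%, a gain of 20 percentage points; the same change expressed as a relative improvement would be 50%, which is why the unit matters when reading the reported 21.63.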
Problem

Research questions and friction points this paper is trying to address.

hallucination
remote sensing
multimodal large language models
RSVQA
RSVG
Innovation

Methods, ideas, or system contributions that make the work stand out.

hallucination evaluation
remote sensing MLLMs
dual-mode checking
domain-tailored mitigation
image-level hallucination
Zihui Zhou
College of Computer Science, Chongqing University, No. 55, South University Road, High-tech Zone, Chongqing, 401331, China, and Heavy Rainfall Research Center of China, No. 3, Donghu East Road, Hongshan District, Wuhan, 430074, Hubei, China
Yong Feng
Swinburne University of Technology, Australia
Sliding Mode Control - Electrical Engineering - Control and Observers
Yanying Chen
CMA Key Open Laboratory of Transforming Climate Resources to Economy, Chongqing Institute of Meteorological Sciences, No. 68, Xinpaofang 1st Road, Yubei District, Chongqing, 401147, China
Guofan Duan
Chongqing Metropolitan College of Science and Technology, No. 368, Guangcai Avenue, Yongchuan District, Chongqing, 402167, China
Zhenxi Song
Unknown affiliation
AI for Neuroscience - Brain-Computer Interface - EEG/MRI Analysis
Mingliang Zhou
College of Computer Science, Chongqing University, No. 55, South University Road, High-tech Zone, Chongqing, 401331, China, and Heavy Rainfall Research Center of China, No. 3, Donghu East Road, Hongshan District, Wuhan, 430074, Hubei, China
Weijia Jia
FIEEE, Chair Professor, Beijing Normal University and UIC
Cyber Intelligent Computing - Networking