From Clouds to Hallucinations: Atmospheric Retrieval Hijacking in Remote Sensing Vision-Language RAG

📅 2026-05-08

🤖 AI Summary
This work addresses an underexplored vulnerability: vision-language retrieval in remote sensing multimodal RAG systems can be attacked at the input level, particularly by perturbations that mimic natural atmospheric disturbances. The authors propose CloudWeb, an attack that overlays learnable cloud-like patterns on remote sensing images to steer the retriever toward target weather-related evidence, thereby manipulating downstream generation. To the authors' knowledge, CloudWeb is the first demonstration that naturalistic atmospheric perturbations can hijack the retrieval process, inducing weather hallucinations and semantic drift. By optimizing parameterized cloud patterns with a retrieval-oriented objective that combines target attraction, source suppression, rank separation, and naturalness regularization, CloudWeb substantially raises weather-evidence recall on a seven-dataset benchmark, lifting Weather@5 for GeoRSCLIP ViT-B/32 from 0.71% to 43.29%.
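The summary reports Weather@5 without defining it. One plausible reading, assumed here and not confirmed by the source, is the percentage of queries whose top-5 retrieved items include at least one weather-related entry. The sketch below illustrates that reading only; the function name `weather_at_k` and the boolean input format are hypothetical.

```python
def weather_at_k(ranked_is_weather, k=5):
    """Percentage of queries with a weather-tagged item in the top-k.

    ranked_is_weather: one list per query, ordered by retrieval rank,
    where each entry is True if the retrieved text is weather-related.
    This definition is an assumption, not the paper's stated metric.
    """
    if not ranked_is_weather:
        raise ValueError("need at least one query")
    hits = sum(1 for flags in ranked_is_weather if any(flags[:k]))
    return 100.0 * hits / len(ranked_is_weather)


# Four toy queries with six ranked results each.
rankings = [
    [False, False, False, False, False, True],   # weather item only at rank 6
    [False, True, False, False, False, False],   # weather item at rank 2
    [False, False, False, False, False, False],  # no weather item
    [True, False, False, False, False, False],   # weather item at rank 1
]
score = weather_at_k(rankings, k=5)
```

Under this reading, a successful attack raises the score because more queries surface weather text near the top of the ranking.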
📝 Abstract
Multimodal RAG systems increasingly rely on vision-language retrievers to ground visual queries in external textual evidence. Existing adversarial studies on RAG mainly manipulate the retrieval corpus or memory, while attacks on vision-language and remote sensing models typically target end-task predictions. Input-space threats to the evidence retrieval stage of remote sensing multimodal RAG remain underexplored. To address this gap, we introduce CloudWeb, an atmospheric retrieval hijacking attack that modifies only the input image while keeping the retriever, generator, and knowledge base fixed at deployment. CloudWeb overlays parameterized cloud- and haze-like patterns on remote sensing images and optimizes them with a retrieval-oriented objective that pulls adversarial image embeddings toward target atmospheric evidence, suppresses source-scene evidence, enforces rank separation, and regularizes naturalness and coverage. To the best of our knowledge, this is the first study of retrieval-stage atmospheric evidence hijacking in remote sensing multimodal RAG. We evaluate CloudWeb on a seven-dataset remote sensing RAG benchmark with five CLIP-style retrievers, including GeoRSCLIP, RemoteCLIP, OpenAI CLIP, and OpenCLIP, together with downstream vision-language generators. Across retrievers, CloudWeb consistently outperforms clean retrieval, handcrafted atmospheric baselines, random cloud perturbations, and fixed variants in injecting weather-related evidence into top-ranked results. On GeoRSCLIP ViT-B/32, Weather@5 increases from 0.71% to 43.29%. Downstream generation further shows measurable weather hallucination and semantic shift, indicating that retrieval-stage hijacking can propagate to the final RAG response. These findings reveal a practical failure mode: natural-looking atmospheric changes can compromise evidence retrieval before generation begins.
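The abstract names four ingredients of the retrieval-oriented objective but gives no formulas. As a rough illustration only, the sketch below combines them as a weighted sum over cosine similarities. Every concrete choice here is an assumption rather than the authors' formulation: the hinge form of the rank term, the coverage cap used as a naturalness proxy, the weights, the margin, and the names `hijack_loss` and `cloud_opacity`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hijack_loss(img_emb, target_embs, source_embs, cloud_opacity,
                margin=0.2, max_coverage=0.3,
                w_attract=1.0, w_suppress=1.0, w_rank=1.0, w_natural=1.0):
    """Toy retrieval-hijacking loss (hypothetical form, lower is better for the attacker)."""
    t_sims = [cosine(img_emb, t) for t in target_embs]  # weather evidence
    s_sims = [cosine(img_emb, s) for s in source_embs]  # true-scene evidence

    # Target attraction: pull the image embedding toward weather texts.
    attract = 1.0 - sum(t_sims) / len(t_sims)
    # Source suppression: push it away from the original scene texts.
    suppress = sum(s_sims) / len(s_sims)
    # Rank separation: hinge so the best target beats the best source by a margin.
    rank = max(0.0, margin + max(s_sims) - max(t_sims))
    # Naturalness/coverage: penalize mean cloud opacity above an assumed cap.
    coverage = sum(cloud_opacity) / len(cloud_opacity)
    natural = max(0.0, coverage - max_coverage)

    return (w_attract * attract + w_suppress * suppress
            + w_rank * rank + w_natural * natural)


# Two-dimensional toy embeddings: axis 0 = "weather", axis 1 = "scene".
targets = [[1.0, 0.0]]
sources = [[0.0, 1.0]]
thin_cloud = [0.1, 0.2]
loss_hijacked = hijack_loss([1.0, 0.0], targets, sources, thin_cloud)
loss_clean = hijack_loss([0.0, 1.0], targets, sources, thin_cloud)
```

In an actual attack the cloud parameters would be updated by gradient descent through a differentiable image encoder; this plain-Python version only shows how the four terms trade off.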
Problem

Research questions and friction points this paper is trying to address.

atmospheric retrieval hijacking
remote sensing
multimodal RAG
vision-language retrieval
adversarial attack
Innovation

Methods, ideas, or system contributions that make the work stand out.

atmospheric retrieval hijacking
remote sensing RAG
vision-language retrieval
adversarial image perturbation
evidence hallucination
Authors
Jiaju Han
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Chao Li
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Chengyin Hu
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Qike Zhang
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Xuemeng Sun
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Xin Wang
China Agricultural University
Mechatronics, Automation, Sensors, Robotics
Fengyu Zhang
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Xiang Chen
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Yiwei Wei
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Jiahuan Long
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China
Jiujiang Guo
China University of Petroleum, Beijing at Karamay, Karamay, Xinjiang 834000, China