Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a critical vulnerability in retrieval-augmented generation (RAG) systems: their susceptibility to knowledge poisoning attacks. Existing defenses focus on detecting contaminated evidence and overlook the “monitoring-control gap”: models can recognize contradictions in retrieved evidence yet still incorporate the poisoned claims into their outputs. To bridge this gap, the authors propose the Cordon Principle, which reframes RAG defense as an information-flow control problem: no component capable of final answer synthesis may directly access untrusted natural-language evidence. Their framework, CORDON-MAS, enforces this architecturally with a multi-agent design that separates evidence extraction, cross-source auditing, and answer synthesis into agents with asymmetric memory permissions. Evaluated across five BEIR datasets, the method reduces attack success rate by 92.4% relative to undefended RAG.

📝 Abstract

Retrieval-augmented generation (RAG) increasingly underpins high-stakes applications, yet remains vulnerable to Confundo-style poisoning where adversarially optimized documents manipulate generated outputs. Existing defenses assume that detecting poisoned evidence prevents harm. We show this assumption is incorrect: models exhibit a monitoring-control gap -- they can detect contradictions in retrieved evidence yet still act on poisoned claims. We introduce the Cordon Principle -- no agent capable of final synthesis may access untrusted natural-language evidence -- and realize it through CORDON-MAS, a compartmentalized framework that enforces this principle architecturally by separating evidence extraction, cross-source audit, and answer synthesis into agents with asymmetric memory privileges. Across five BEIR datasets, CORDON-MAS reduces attack success rate by 92.4% relative to undefended RAG. This reframes RAG poisoning from a detection problem to an information-flow control problem.
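To make the Cordon Principle concrete, the following is a minimal sketch of how asymmetric memory privileges could keep a synthesis step away from raw retrieved text. It is not the authors' implementation. Every name here (`Zone`, `Memory`, `Claim`, the role strings, the permission tables) is hypothetical, the string-splitting extractor stands in for an LLM call, and the majority-vote audit is an assumed placeholder for whatever cross-source audit the paper actually uses.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum, auto


class Zone(Enum):
    RAW_EVIDENCE = auto()  # untrusted natural-language passages
    CLAIMS = auto()        # structured claims extracted per source
    AUDITED = auto()       # claims that survived the cross-source audit


# Hypothetical permission tables: the synthesizer can read only AUDITED.
READ = {
    "extractor": {Zone.RAW_EVIDENCE},
    "auditor": {Zone.CLAIMS},
    "synthesizer": {Zone.AUDITED},
}
WRITE = {
    "retriever": {Zone.RAW_EVIDENCE},
    "extractor": {Zone.CLAIMS},
    "auditor": {Zone.AUDITED},
}


@dataclass(frozen=True)
class Claim:
    source_id: str
    answer: str  # short normalized value, never the raw passage


class Memory:
    """Zoned store that checks the role's privileges on every access."""

    def __init__(self):
        self._zones = {zone: [] for zone in Zone}

    def read(self, role, zone):
        if zone not in READ.get(role, set()):
            raise PermissionError(f"{role} may not read {zone.name}")
        return list(self._zones[zone])

    def write(self, role, zone, item):
        if zone not in WRITE.get(role, set()):
            raise PermissionError(f"{role} may not write {zone.name}")
        self._zones[zone].append(item)


def extract(memory):
    # Toy stand-in for an extraction agent: keep only the text after " is ".
    for source_id, passage in memory.read("extractor", Zone.RAW_EVIDENCE):
        value = passage.rsplit(" is ", 1)[-1].strip(" .")
        memory.write("extractor", Zone.CLAIMS, Claim(source_id, value))


def audit(memory, min_support=2):
    # Assumed audit rule: keep claims backed by at least min_support sources.
    claims = memory.read("auditor", Zone.CLAIMS)
    support = Counter(claim.answer for claim in claims)
    for claim in claims:
        if support[claim.answer] >= min_support:
            memory.write("auditor", Zone.AUDITED, claim)


def synthesize(memory):
    # The synthesizer sees audited structured claims only.
    audited = memory.read("synthesizer", Zone.AUDITED)
    if not audited:
        return "insufficient evidence"
    return Counter(claim.answer for claim in audited).most_common(1)[0][0]


def answer(passages):
    memory = Memory()
    for item in passages:
        memory.write("retriever", Zone.RAW_EVIDENCE, item)
    extract(memory)
    audit(memory)
    return synthesize(memory)
```

With two consistent passages and one injected passage that asserts a different answer, `answer(...)` returns the value supported by two sources, and a direct `memory.read("synthesizer", Zone.RAW_EVIDENCE)` raises `PermissionError`. The in-process check only illustrates the access pattern; a real deployment would need actual isolation between agents, and a simple majority vote would not hold up if an attacker controls most of the retrieved documents.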

Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation

Knowledge Poisoning

Information-Flow Control

Adversarial Attacks

RAG Security

Innovation

Methods, ideas, or system contributions that make the work stand out.

information-flow control

retrieval-augmented generation

knowledge poisoning