CODE: A Contradiction-Based Deliberation Extension Framework for Overthinking Attacks on Retrieval-Augmented Generation

📅 2026-01-19
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work proposes the CODE framework to address a critical vulnerability in reasoning-enhanced retrieval-augmented generation (RAG) systems: their susceptibility to "overthinking" attacks, which induce excessive and redundant reasoning steps, leading to substantial computational overhead. The study is the first to reveal that RAG inherits overthinking risks from its underlying reasoning models and introduces an end-to-end attack methodology that leverages multi-agent collaboration to generate poisoned knowledge-base entries. These adversarial samples are logically contradictory, evidentially conflicting, stylistically diverse, and highly stealthy, yet they require no modification of user queries and preserve task accuracy while significantly elongating the reasoning chain. Experiments across two datasets and five mainstream commercial reasoning models demonstrate 5.32–24.72× increases in reasoning tokens, confirming both the effectiveness and stealthiness of the proposed attack.

πŸ“ Abstract
Introducing reasoning models into Retrieval-Augmented Generation (RAG) systems enhances task performance through step-by-step reasoning, logical consistency, and multi-step self-verification. However, recent studies have shown that reasoning models suffer from overthinking attacks, where models are tricked into generating an unnecessarily high number of reasoning tokens. In this paper, we reveal that this overthinking risk can be inherited by RAG systems equipped with reasoning models, by proposing an end-to-end attack framework named Contradiction-Based Deliberation Extension (CODE). Specifically, CODE develops a multi-agent architecture to construct poisoning samples that are injected into the knowledge base. These samples 1) are highly correlated with the user query, so that they are retrieved as inputs to the reasoning model; and 2) contain contradictions between the logical and evidence layers that cause models to overthink, and are optimized to exhibit highly diverse styles. Moreover, the inference overhead caused by CODE is extremely difficult to detect, as no modification is needed on the user query, and task accuracy remains unaffected. Extensive experiments on two datasets across five commercial reasoning models demonstrate that the proposed attack causes a 5.32x–24.72x increase in reasoning token consumption, without degrading task performance. Finally, we also discuss and evaluate potential countermeasures to mitigate overthinking risks.
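The attack's headline metric is the ratio of reasoning tokens consumed when poisoned entries appear in the retrieved context versus a clean knowledge base. A minimal sketch of that measurement protocol is below; the `run_reasoning_model` stub and its inflation behavior are purely illustrative stand-ins (not the paper's method or any real API), used only to show how the reported x-fold increase would be computed.

```python
def run_reasoning_model(query, context):
    """Stub for a commercial reasoning model call.

    Returns (answer, reasoning_token_count). Toy behavior: each passage
    containing a contradiction marker inflates the deliberation budget,
    mimicking the overthinking effect described in the abstract.
    """
    tokens = 100  # baseline reasoning budget (arbitrary toy value)
    for passage in context:
        if "contradiction" in passage:
            tokens += 400  # extra deliberation triggered by the conflict
    return "paris", tokens

def token_inflation(queries, retrieve_clean, retrieve_poisoned):
    """Ratio of total reasoning tokens under poisoned vs. clean retrieval."""
    clean = sum(run_reasoning_model(q, retrieve_clean(q))[1] for q in queries)
    attacked = sum(run_reasoning_model(q, retrieve_poisoned(q))[1] for q in queries)
    return attacked / clean
```

Because the answer itself is unchanged between the clean and poisoned runs, only the token count moves, which is why the paper can claim the attack is stealthy with respect to task accuracy.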
Problem

Research questions and friction points this paper is trying to address.

overthinking attacks
Retrieval-Augmented Generation
reasoning models
adversarial poisoning
computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

overthinking attacks
retrieval-augmented generation
contradiction-based poisoning
reasoning token inflation
multi-agent adversarial generation