Integrating Chain-of-Thought into Generative Retrieval: A Preliminary Study

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing generative retrieval methods: they map queries directly to document IDs without intermediate reasoning and therefore struggle with complex multi-hop queries. The authors propose ThinkGR, a framework that integrates chain-of-thought reasoning into generative retrieval by interleaving iterative reasoning with document ID generation in a single generative process. A hybrid decoding strategy switches between unconstrained thought generation and constrained docid decoding, and a two-phase training approach (supervised fine-tuning followed by retrieval-grounded reinforcement learning) unifies free-form reasoning with structured retrieval. Evaluated on four multi-hop retrieval benchmarks, ThinkGR achieves state-of-the-art performance with an average improvement of 6.86%.

📝 Abstract

While generative retrieval (GR) demonstrates competitive performance on standard retrieval benchmarks, existing approaches directly map queries to document identifiers (docids) without intermediate deliberation, limiting their effectiveness for complex queries that require multi-step reasoning. As a preliminary study on integrating chain-of-thought (CoT) into generative retrieval, we introduce ThinkGR, a unified framework that interleaves CoT with docid generation, enabling iterative thinking and retrieval within a single generative process. To bridge the gap between free-form thought generation and structured retrieval targets, we design (1) a hybrid decoding strategy that dynamically switches between unconstrained thought generation and constrained docid decoding, and (2) a two-phase training approach that first aligns thought-retrieval patterns through supervised fine-tuning, then optimizes thought quality via retrieval-grounded reinforcement learning. Experiments on four multi-hop retrieval benchmarks demonstrate that ThinkGR achieves state-of-the-art performance with an average improvement of +6.86%. Our work opens new avenues for enhancing generative retrieval with explicit deliberation capabilities, with promising implications for retrieval tasks requiring complex reasoning.
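
The hybrid decoding idea can be illustrated with a toy sketch. This is not the authors' implementation: the marker tokens `<docid>` and `</docid>`, the prefix trie over docid token sequences, the greedy token choice, and the `score` callback standing in for a language model are all assumptions made for illustration. The sketch only shows the switching logic: free-form tokens are unconstrained until a docid span opens, after which only tokens that continue a valid docid are allowed.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical marker tokens (assumed for this sketch, not taken from the paper).
DOC_START = "<docid>"
DOC_END = "</docid>"
EOS = "<eos>"


def build_trie(docids: Sequence[Sequence[str]]) -> Dict[str, dict]:
    """Build a prefix trie over docid token sequences; DOC_END closes each docid."""
    root: Dict[str, dict] = {}
    for tokens in docids:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
        node[DOC_END] = {}
    return root


def hybrid_decode(
    score: Callable[[List[str], str], float],
    trie: Dict[str, dict],
    vocab: Sequence[str],
    max_steps: int = 32,
) -> List[str]:
    """Greedy decoding that switches between free and trie-constrained modes."""
    output: List[str] = []
    node: Optional[Dict[str, dict]] = None  # None means free-form thought mode
    for _ in range(max_steps):
        if node is None:
            # Thought mode: any token except a stray closing marker.
            candidates = [t for t in vocab if t != DOC_END]
        else:
            # Docid mode: only tokens that extend a valid docid prefix.
            candidates = list(node.keys())
        token = max(candidates, key=lambda t: score(output, t))
        output.append(token)
        if token == EOS:
            break
        if node is None:
            if token == DOC_START:
                node = trie  # enter constrained mode at the trie root
        else:
            node = None if token == DOC_END else node[token]
    return output


# Toy stand-in for a model: it "prefers" a fixed script, including an invalid docid token.
script = ["think", DOC_START, "d9", "d2", DOC_END, EOS]


def toy_score(prefix: List[str], token: str) -> float:
    step = len(prefix)
    return 1.0 if step < len(script) and script[step] == token else 0.0


vocab = ["think", DOC_START, DOC_END, EOS, "d1", "d2", "d3", "d9"]
trie = build_trie([["d1", "d2"], ["d1", "d3"]])
result = hybrid_decode(toy_score, trie, vocab)
print(result)
```

In this toy run the scorer prefers the token `d9`, which starts no valid docid, so the constrained step falls back to `d1`, the only valid continuation. A real system would apply the same mask to model logits and would likely use beam search rather than greedy selection.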

Problem

Research questions and friction points this paper is trying to address.

generative retrieval

chain-of-thought

multi-hop retrieval

complex reasoning

docid generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

generative retrieval

chain-of-thought

hybrid decoding