🤖 AI Summary
Traditional information retrieval prioritizes topical relevance, which often fails to capture how useful retrieved content actually is for downstream large language model (LLM) tasks. This tutorial argues for a utility-centered retrieval paradigm that shifts the objective from relevance to the contribution retrieved content makes to LLM generation quality. It presents a unified framework covering the main forms of utility (LLM-agnostic versus LLM-specific, and context-independent versus context-dependent) and connects them to LLM information needs and agentic RAG. By bringing together retrieval-augmented generation, information need modeling, utility-oriented evaluation metrics, and retrieval strategies, the tutorial reframes how retrieval should be evaluated, outlines how retrieval and generation can be optimized jointly, and offers conceptual foundations and practical guidance for information retrieval in the LLM era.
📝 Abstract
Information retrieval systems have traditionally optimized for topical relevance, the degree to which retrieved documents match a query. However, relevance only approximates a deeper goal: utility, namely, whether retrieved information helps accomplish a user's underlying task. The emergence of retrieval-augmented generation (RAG) fundamentally changes this paradigm. Retrieved documents are no longer consumed directly by users but instead serve as evidence for large language models (LLMs) that produce answers. As a result, retrieval effectiveness must be evaluated by its contribution to generation quality rather than by relevance-based ranking metrics alone. This tutorial argues that retrieval objectives are evolving from relevance-centric optimization toward LLM-centric utility. We present a unified framework covering LLM-agnostic versus LLM-specific utility, context-independent versus context-dependent utility, and the connection with LLM information needs and agentic RAG. By synthesizing recent advances, the tutorial provides conceptual foundations and practical guidance for designing retrieval systems aligned with the requirements of LLM-based information access.