🤖 AI Summary
This study addresses the lack of an effective evaluation framework for structured generative search summaries, which appear above organic web search results and comprise an overview, titled sections, and a list of cited source documents. It presents the first systematic effort to build such a framework, explicitly defining the summaries' core components and multidimensional assessment criteria. By combining large language model generation with established information retrieval evaluation methodology, the work proposes a practical and scalable framework and outlines a plan for its implementation and empirical validation. The contribution provides a methodological basis for future research on generative search summaries and their effect on user experience and information access.
📝 Abstract
We propose a framework for evaluating structured generative search summaries that are placed atop organic web search results. A structured summary, generated by a large language model, typically consists of an overview, several sections with section titles, and a list of source documents that are cited within the summary. We then describe our plans for implementing and evaluating the framework.
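To make the summary structure described above concrete, the following is a minimal Python sketch of one way it could be represented. All class and field names (`SourceDocument`, `Section`, `StructuredSummary`, `cited_doc_ids`, `uncited_sources`) are hypothetical and chosen for illustration; they are not taken from the paper, and the consistency check is only an assumed example of a structural property one might verify, not part of the proposed framework.

```python
from dataclasses import dataclass, field


@dataclass
class SourceDocument:
    # Hypothetical fields identifying one source in the summary's source list.
    doc_id: str
    title: str
    url: str


@dataclass
class Section:
    # One titled section; cited_doc_ids holds the sources cited in its text.
    title: str
    text: str
    cited_doc_ids: list[str] = field(default_factory=list)


@dataclass
class StructuredSummary:
    # Overview, titled sections, and the list of source documents.
    overview: str
    sections: list[Section]
    sources: list[SourceDocument]
    overview_cited_doc_ids: list[str] = field(default_factory=list)

    def uncited_sources(self) -> list[str]:
        """Return ids of listed sources that no part of the summary cites."""
        cited = set(self.overview_cited_doc_ids)
        for section in self.sections:
            cited.update(section.cited_doc_ids)
        return [s.doc_id for s in self.sources if s.doc_id not in cited]


# Illustrative instance with made-up content.
summary = StructuredSummary(
    overview="Short overview of the topic.",
    sections=[Section(title="Background", text="...", cited_doc_ids=["d1"])],
    sources=[
        SourceDocument("d1", "First source", "https://example.com/1"),
        SourceDocument("d2", "Second source", "https://example.com/2"),
    ],
)
```

Since the source list is described as containing documents cited within the summary, a call such as `summary.uncited_sources()` could flag listed sources that are never actually cited; whether such a check belongs in the evaluation is an assumption of this sketch.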