🤖 AI Summary
Modular generative information access (GenIA) systems combine heterogeneous components, including large language models (LLMs), retrieval models, tools, and data sources, that are difficult to orchestrate and that adapt poorly to changing queries and module capabilities. To address this, the paper proposes a real-time adaptive orchestration architecture. It first systematically defines the GenIA module interaction space, then introduces a real-time self-organizing orchestration paradigm grounded in joint awareness of query semantics and module capabilities, so that on-demand configuration, utility optimization, and cost control are handled together. The core technical elements are: (i) a lightweight runtime evaluation mechanism driven by a meta-controller; (ii) fine-grained module capability profiling; (iii) dynamic routing policies; and (iv) feedback-driven online optimization. The resulting orchestration framework sets out key design principles and implementation pathways, offering the information retrieval (IR) community a blueprint for next-generation system architectures that keep pace with evolving AI infrastructure.
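As a rough illustration of how capability profiling, dynamic routing, and feedback-driven updates might fit together, here is a minimal sketch. It is not the paper's method: the names `ModuleProfile`, `route`, and `update`, the query types, the utility and cost numbers, and the linear utility-minus-cost score are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ModuleProfile:
    """Hypothetical capability profile for one module (illustrative only)."""
    name: str
    skills: dict[str, float]  # assumed utility estimate per query type, in [0, 1]
    cost: float               # assumed relative compute cost per call


def route(query_type: str, modules: list[ModuleProfile],
          cost_weight: float = 0.5) -> ModuleProfile:
    """Pick the module with the highest utility-minus-weighted-cost score."""
    def score(m: ModuleProfile) -> float:
        return m.skills.get(query_type, 0.0) - cost_weight * m.cost
    return max(modules, key=score)


def update(profile: ModuleProfile, query_type: str,
           reward: float, lr: float = 0.2) -> None:
    """Feedback step: move the utility estimate toward the observed reward."""
    old = profile.skills.get(query_type, 0.0)
    profile.skills[query_type] = old + lr * (reward - old)


# Made-up profiles for two hypothetical modules.
modules = [
    ModuleProfile("sparse_retriever", {"factoid": 0.6, "multi_hop": 0.3}, cost=0.1),
    ModuleProfile("llm_with_tools", {"factoid": 0.7, "multi_hop": 0.9}, cost=0.8),
]

cheap_choice = route("factoid", modules)    # low cost wins on simple queries
heavy_choice = route("multi_hop", modules)  # higher utility outweighs cost here
```

With these assumed numbers, the cheaper retriever is selected for factoid queries and the costlier LLM pipeline for multi-hop queries. Calling `update` after observing a reward shifts the stored estimate, so later routing decisions can change as feedback accumulates.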
📝 Abstract
Advancements in large language models (LLMs) have driven the emergence of complex new systems for providing access to information, which we collectively refer to as modular generative information access (GenIA) systems. They integrate a broad and evolving range of specialized components, including LLMs, retrieval models, and a heterogeneous set of sources and tools. While modularity offers flexibility, it also raises critical challenges: How can we systematically characterize the space of possible modules and their interactions? How can we automate and optimize interactions among these heterogeneous components? And how do we enable this modular system to dynamically adapt to varying user query requirements and evolving module capabilities? In this perspective paper, we argue that the architecture of future modular generative information access systems will not just assemble powerful components, but enable a self-organizing system through real-time adaptive orchestration, in which components' interactions are dynamically configured for each user input, maximizing information relevance while minimizing computational overhead. We give provisional answers to the questions raised above with a roadmap that depicts the key principles and methods for designing such an adaptive modular system. We identify pressing challenges and propose avenues for addressing them in the years ahead. This perspective urges the IR community to rethink modular system designs in order to develop adaptive, self-optimizing, and future-ready architectures that evolve alongside their rapidly advancing underlying technologies.