🤖 AI Summary
Current RAG development lacks a vendor-agnostic, stable, and interoperable API standard, which leads to tightly coupled systems and poor reusability. Method: Inspired by compiler intrinsics (built-in functions), this paper introduces "LLM intrinsics" as a standardized capability interface for RAG. Each intrinsic is a lightweight, composable capability exposed through a reasonably stable API that decouples invocation from the underlying implementation. Technically, the intrinsics are implemented as LoRA adapters, served with vLLM for inference, given structured input/output interfaces, and published on Hugging Face. Contribution/Results: The authors release multiple plug-and-play intrinsics (e.g., query rewriting, context compression) that support API-level reuse and chaining of several intrinsics. The paper documents the intended usage, training details, and evaluations for each intrinsic, as well as for compositions of multiple intrinsics.
📝 Abstract
In the developer community for large language models (LLMs), there is not yet a clean pattern, analogous to a software library, that supports very-large-scale collaboration. Even for the commonplace use case of Retrieval-Augmented Generation (RAG), it is not currently possible to write a RAG application against a well-defined set of APIs agreed upon by different LLM providers. Inspired by the idea of compiler intrinsics, we propose some elements of such a concept by introducing a library of LLM Intrinsics for RAG. An LLM intrinsic is defined as a capability that can be invoked through a well-defined API that is reasonably stable and independent of how the LLM intrinsic itself is implemented. The intrinsics in our library are released as LoRA adapters on Hugging Face, and through a software interface with clear structured input/output characteristics on top of vLLM as an inference platform, accompanied in both places by documentation and code. This article describes the intended usage, training details, and evaluations for each intrinsic, as well as compositions of multiple intrinsics.
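To make the definition of an intrinsic more concrete, the following is a minimal Python sketch of what "a well-defined API independent of the implementation" could look like for a query-rewriting capability. It is an illustration under stated assumptions, not the library's actual interface: the names `RewriteInput`, `RewriteOutput`, `QueryRewriteIntrinsic`, `StubQueryRewriter`, and `retrieve_with_rewrite` are all hypothetical, and the stub uses simple string handling in place of a LoRA adapter served by vLLM.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class RewriteInput:
    """Structured input (hypothetical schema, not taken from the paper)."""
    conversation: list[str]  # prior turns, oldest first
    query: str               # latest user query, possibly context-dependent


@dataclass(frozen=True)
class RewriteOutput:
    """Structured output (hypothetical schema)."""
    rewritten_query: str


class QueryRewriteIntrinsic(Protocol):
    """The stable contract. Callers depend only on this signature."""

    def __call__(self, request: RewriteInput) -> RewriteOutput: ...


class StubQueryRewriter:
    """Stand-in implementation that needs no model.

    A real implementation might forward the request to a LoRA adapter
    behind an inference server; that detail stays hidden from callers.
    """

    def __call__(self, request: RewriteInput) -> RewriteOutput:
        context = " ".join(request.conversation)
        return RewriteOutput(
            rewritten_query=f"{request.query} (context: {context})"
        )


def retrieve_with_rewrite(
    rewriter: QueryRewriteIntrinsic,
    retriever: Callable[[str], list[str]],
    request: RewriteInput,
) -> list[str]:
    """Compose two steps: rewrite the query, then retrieve with the result."""
    rewritten = rewriter(request)
    return retriever(rewritten.rewritten_query)
```

Because `retrieve_with_rewrite` only relies on the `QueryRewriteIntrinsic` contract, the stub could in principle be swapped for a model-backed implementation without changing the calling code, which is the decoupling the abstract describes.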