A Library of LLM Intrinsics for Retrieval-Augmented Generation

📅 2025-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: RAG development currently lacks a vendor-agnostic, stable, and interoperable API standard, which leads to tightly coupled systems and poor reusability. Method: Inspired by compiler intrinsics, this paper introduces a library of LLM Intrinsics for RAG. An LLM intrinsic is a capability invoked through a well-defined, reasonably stable API that is independent of how the capability itself is implemented. The intrinsics are trained as LoRA adapters, published on Hugging Face, and exposed through a software interface with structured input/output on top of vLLM. Contribution/Results: The authors release multiple intrinsic modules (e.g., query rewriting) with documentation and code, and describe the intended usage, training details, and evaluations for each intrinsic as well as for compositions of multiple intrinsics.

📝 Abstract

In the developer community for large language models (LLMs), there is not yet a clean pattern analogous to a software library, to support very large scale collaboration. Even for the commonplace use case of Retrieval-Augmented Generation (RAG), it is not currently possible to write a RAG application against a well-defined set of APIs that are agreed upon by different LLM providers. Inspired by the idea of compiler intrinsics, we propose some elements of such a concept through introducing a library of LLM Intrinsics for RAG. An LLM intrinsic is defined as a capability that can be invoked through a well-defined API that is reasonably stable and independent of how the LLM intrinsic itself is implemented. The intrinsics in our library are released as LoRA adapters on HuggingFace, and through a software interface with clear structured input/output characteristics on top of vLLM as an inference platform, accompanied in both places with documentation and code. This article describes the intended usage, training details, and evaluations for each intrinsic, as well as compositions of multiple intrinsics.
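To make the idea of an intrinsic concrete, here is a minimal sketch of a stable, implementation-independent contract for query rewriting. It is not the paper's library: `RewriteInput`, `RewriteOutput`, `QueryRewriteIntrinsic`, and both toy rewriters are hypothetical names chosen for illustration, and a real implementation would presumably call a LoRA-adapted model rather than manipulate strings.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class RewriteInput:
    """Structured input: prior conversation turns plus the latest user query."""
    history: tuple[str, ...]
    query: str


@dataclass(frozen=True)
class RewriteOutput:
    """Structured output: a standalone query suitable for a retriever."""
    rewritten_query: str


class QueryRewriteIntrinsic(Protocol):
    """Hypothetical stable contract; callers depend only on this signature."""

    def __call__(self, request: RewriteInput) -> RewriteOutput: ...


class PassthroughRewriter:
    """Toy implementation that returns the query unchanged."""

    def __call__(self, request: RewriteInput) -> RewriteOutput:
        return RewriteOutput(rewritten_query=request.query)


class HistoryPrefixRewriter:
    """Toy implementation that prepends the most recent turn as context.

    Assumption: stands in for a model-backed rewriter in this sketch.
    """

    def __call__(self, request: RewriteInput) -> RewriteOutput:
        if not request.history:
            return RewriteOutput(rewritten_query=request.query)
        return RewriteOutput(
            rewritten_query=f"{request.history[-1]} {request.query}"
        )


def retrieve_with_rewrite(
    rewriter: QueryRewriteIntrinsic, request: RewriteInput
) -> str:
    """Application code: works with any implementation of the contract."""
    return rewriter(request).rewritten_query
```

Because `retrieve_with_rewrite` depends only on the input and output types, either rewriter can be swapped in without touching the application code, which is the property the abstract attributes to an intrinsic.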

Problem

Research questions and friction points this paper is trying to address.

Lack of standardized APIs for LLM collaboration in RAG

Need for stable, implementation-agnostic LLM capabilities

Absence of a unified library for large-scale RAG development

Innovation

Methods, ideas, or system contributions that make the work stand out.

Library of LLM Intrinsics for RAG

LoRA adapters on HuggingFace

Structured I/O interface with vLLM
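A structured I/O layer of this kind might look roughly like the following sketch, which serializes a request to JSON, hands it to an inference backend, and validates that a JSON object comes back. The function `call_intrinsic`, the `IntrinsicResult` type, the request layout, and the `rewritten_query` key are all assumptions made for illustration; `fake_backend` is a deterministic stub that stands in for a vLLM-served adapter and does not reflect the paper's actual interface.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class IntrinsicResult:
    """Parsed, structured result of one intrinsic call (hypothetical type)."""
    name: str
    payload: dict


def call_intrinsic(
    name: str, inputs: dict, generate: Callable[[str], str]
) -> IntrinsicResult:
    """Serialize inputs to JSON, call the backend, and parse a JSON object back.

    `generate` is a stand-in for the inference backend (for example a
    vLLM-served LoRA adapter); its signature is an assumption of this sketch.
    """
    prompt = json.dumps({"intrinsic": name, "inputs": inputs}, sort_keys=True)
    raw = generate(prompt)
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{name}: backend returned non-JSON output") from exc
    if not isinstance(payload, dict):
        raise ValueError(
            f"{name}: expected a JSON object, got {type(payload).__name__}"
        )
    return IntrinsicResult(name=name, payload=payload)


def fake_backend(prompt: str) -> str:
    """Deterministic stub used only to make the sketch runnable."""
    request = json.loads(prompt)
    return json.dumps({"rewritten_query": request["inputs"]["query"].strip()})
```

Calling `call_intrinsic("query_rewrite", {"query": "  what is RAG? "}, fake_backend)` returns a result whose `payload` is a plain dictionary, so downstream code never has to parse free-form model text.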

🔎 Similar Papers

LRP4RAG: Detecting Hallucinations in Retrieval-Augmented Generation via Layer-wise Relevance Propagation