Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

📅 2025-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of jointly retrieving structured tabular data and unstructured text in knowledge-intensive analytical workloads, this paper proposes FlockMTL, a unified execution framework that embeds LLMs and retrieval-augmented generation (RAG) directly within DuckDB. Methodologically, it introduces three key innovations: (1) the abstraction of PROMPTs and MODELs as first-class SQL DDL schema objects alongside TABLEs; (2) cost-aware optimization of LLM operators, including batched inference and caching, under a resource-independence principle; and (3) native LLM-based scalar and aggregate functions, seamless RAG integration, and relational-algebra-inspired query optimization with model orchestration. The framework significantly reduces end-to-end development complexity while preserving SQL's declarative simplicity, enabling tuple-level chained predictions with low-overhead, high-throughput LLM inference and no sacrifice of expressiveness.

📝 Abstract
Knowledge-intensive analytical applications retrieve context from both structured tabular data and unstructured, free-text documents for effective decision-making. Large language models (LLMs) have made it significantly easier to prototype such retrieval and reasoning data pipelines. However, implementing these pipelines efficiently still demands significant effort and poses several challenges. This often involves orchestrating heterogeneous data systems, managing data movement, and handling low-level implementation details, e.g., LLM context management. To address these challenges, we introduce FlockMTL: an extension for DBMSs that deeply integrates LLM capabilities and retrieval-augmented generation (RAG). FlockMTL includes model-driven scalar and aggregate functions, enabling chained predictions through tuple-level mappings and reductions. Drawing inspiration from the relational model, FlockMTL incorporates: (i) cost-based optimizations, which seamlessly apply techniques such as batching and caching; and (ii) resource independence, enabled through novel SQL DDL abstractions: PROMPT and MODEL, introduced as first-class schema objects alongside TABLE. FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden.
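The PROMPT and MODEL abstractions described in the abstract can be pictured with a short SQL sketch. This is illustrative only: the object names, the two-step DDL shape, and the `llm_complete` scalar call are assumptions about FlockMTL-style syntax, not verbatim from the paper.

```sql
-- Hypothetical sketch: PROMPT and MODEL as first-class schema objects
-- alongside TABLE (names and exact FlockMTL syntax are assumed).
CREATE MODEL review_model AS ('gpt-4o-mini', 'openai');
CREATE PROMPT sentiment_prompt AS
    'Classify the sentiment of this product review as positive or negative.';

-- LLM-backed scalar function applied per tuple; per the paper's design,
-- the optimizer can batch and cache the underlying inference calls.
SELECT review_id,
       llm_complete('review_model', 'sentiment_prompt', review_text) AS sentiment
FROM product_reviews;
```

Because the model and prompt are schema objects rather than string literals inside the query, they can be swapped or versioned without rewriting the query itself, which is the resource-independence idea the abstract refers to.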
Problem

Research questions and friction points this paper is trying to address.

Integrating LLMs and RAG into DBMS for analytics
Reducing effort in retrieval-reasoning pipeline implementation
Optimizing resource use and cost in data systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep integration of LLMs and RAG into DuckDB
Model-driven scalar and aggregate functions
Cost-based optimizations with SQL DDL abstractions
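The model-driven aggregate functions listed above (tuple-level reductions) can be sketched in the same hedged style; `llm_reduce` and the grouping shape are assumed, not confirmed FlockMTL API:

```sql
-- Hypothetical LLM aggregate: reduce the tuples of each group into one
-- generated value, e.g. a summary of all reviews per product.
SELECT product_id,
       llm_reduce('review_model', 'summary_prompt', review_text) AS review_summary
FROM product_reviews
GROUP BY product_id;
```

A scalar LLM function maps one tuple to one prediction, while an aggregate like this reduces a whole group, which is how the summary's "tuple-level chained prediction" composes mappings and reductions inside ordinary SQL.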