iPDB -- Optimizing SQL Queries with ML and LLM Predicates

📅 2026-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional SQL and relational database systems are either incompatible with or inefficient for workloads that rely on machine learning (ML) models and large language models (LLMs), so such workloads typically require complex engineering and repeated data migration between the data sources and the model inference platform. The paper proposes iPDB, a relational system that performs ML and LLM inference inside the database through extended SQL syntax covering semantic projection, selection, join, and grouping. Its key contributions are a relational predict operator and semantic query optimizations, which together let a single SQL query combine classical relational operations with model-based inference. The authors report that iPDB lets users express semantic queries concisely in SQL and executes them more efficiently than state-of-the-art approaches.
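
To make the four semantic operations concrete, here is a minimal Python sketch that applies them to in-memory rows. It does not reproduce iPDB's SQL syntax or its predict operator, which this page does not show. The names `toy_model`, `semantic_project`, `semantic_select`, `semantic_group`, and `semantic_join` are hypothetical, and `toy_model` is a keyword rule standing in for a real ML or LLM call.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Row = Dict[str, object]
Model = Callable[[str, str], str]


def toy_model(prompt: str, text: str) -> str:
    """Hypothetical stand-in for an ML/LLM call: keyword-based sentiment."""
    lowered = text.lower()
    if "sentiment" in prompt:
        return "negative" if ("broken" in lowered or "bad" in lowered) else "positive"
    return "unknown"


def semantic_project(rows: List[Row], model: Model, prompt: str, src: str, dst: str) -> List[Row]:
    """Semantic projection: add a model-derived column to every row."""
    return [{**r, dst: model(prompt, str(r[src]))} for r in rows]


def semantic_select(rows: List[Row], model: Model, prompt: str, src: str, expected: str) -> List[Row]:
    """Semantic selection: keep rows whose model output equals `expected`."""
    return [r for r in rows if model(prompt, str(r[src])) == expected]


def semantic_group(rows: List[Row], model: Model, prompt: str, src: str) -> Dict[str, List[Row]]:
    """Semantic grouping: bucket rows by the model's output."""
    groups: Dict[str, List[Row]] = defaultdict(list)
    for r in rows:
        groups[model(prompt, str(r[src]))].append(r)
    return dict(groups)


def semantic_join(left: List[Row], right: List[Row], match: Callable[[Row, Row], bool]) -> List[Row]:
    """Semantic join: nested-loop join keeping pairs that `match` accepts."""
    return [{**l, **r} for l in left for r in right if match(l, r)]


reviews = [
    {"id": 1, "text": "Great battery life"},
    {"id": 2, "text": "Screen arrived broken"},
    {"id": 3, "text": "Bad packaging, fine product"},
]
products = [{"product": "battery"}, {"product": "screen"}]

labeled = semantic_project(reviews, toy_model, "sentiment", "text", "sentiment")
negative = semantic_select(reviews, toy_model, "sentiment", "text", "negative")
by_sentiment = semantic_group(reviews, toy_model, "sentiment", "text")
# A substring test stands in for a model deciding whether a review mentions a product.
joined = semantic_join(reviews, products, lambda l, r: str(r["product"]) in str(l["text"]).lower())
```

In a system like the one described, each of these would be written as part of a SQL query and planned by the database rather than called as a Python function. The sketch only shows what each operation computes.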

📝 Abstract
Structured Query Language (SQL) has remained the standard query language for databases. SQL is highly optimized for processing structured data laid out in relations. Meanwhile, in the present application development landscape, it is highly desirable to utilize the power of learned models to perform complex tasks. Large language models (LLMs) have been shown to understand and extract information from unstructured textual data. However, SQL as a query language and accompanying relational database systems are either incompatible or inefficient for workloads that require leveraging learned models. This results in complex engineering and multiple data migration operations that move data between the data sources and the model inference platform. In this paper, we present iPDB, a relational system that supports in-database machine learning (ML) and large language model (LLM) inferencing using extended SQL syntax. In iPDB, LLMs and ML calls can function as semantic projects, as predicates to perform semantic selects and semantic joins, or for semantic grouping in group-by clauses. iPDB has a novel relational predict operator and semantic query optimizations that enable users to write and efficiently execute semantic SQL queries, outperforming the state-of-the-art.
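
The abstract does not say which semantic query optimizations iPDB applies. As an illustration of why plan choice matters when a predicate invokes a model, the sketch below compares two plans for the same query. It assumes two generic techniques that are not confirmed as iPDB's: evaluating the cheap relational predicate before the model predicate, and reusing one model result per distinct input value. `CountingModel`, `naive_plan`, and `optimized_plan` are hypothetical names.

```python
from typing import Dict, List

Row = Dict[str, object]


class CountingModel:
    """Hypothetical stand-in for an expensive ML/LLM predicate; counts its invocations."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> bool:
        self.calls += 1
        return "refund" in text.lower()


def naive_plan(rows: List[Row], model: CountingModel) -> List[Row]:
    # Model predicate is evaluated first, so it runs on every row.
    return [r for r in rows if model(str(r["text"])) and r["year"] == 2025]


def optimized_plan(rows: List[Row], model: CountingModel) -> List[Row]:
    # Cheap relational predicate first, then at most one model call per distinct text.
    cache: Dict[str, bool] = {}
    out: List[Row] = []
    for r in rows:
        if r["year"] != 2025:
            continue
        text = str(r["text"])
        if text not in cache:
            cache[text] = model(text)
        if cache[text]:
            out.append(r)
    return out


tickets = [
    {"id": 1, "year": 2024, "text": "Please refund my order"},
    {"id": 2, "year": 2025, "text": "Please refund my order"},
    {"id": 3, "year": 2025, "text": "Please refund my order"},
    {"id": 4, "year": 2025, "text": "Where is my parcel?"},
]

naive_model, fast_model = CountingModel(), CountingModel()
naive_result = naive_plan(tickets, naive_model)
fast_result = optimized_plan(tickets, fast_model)
```

Both plans return the same rows, but the second issues fewer model calls (`fast_model.calls` versus `naive_model.calls`). With a real LLM behind the predicate, the number of calls is usually the dominant cost.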
Problem

Research questions and friction points this paper is trying to address.

SQL
Large Language Models
In-database Machine Learning
Semantic Queries
Relational Databases
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-database ML
LLM integration
semantic SQL
relational predict operator
query optimization