Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm

📅 2026-03-05
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the inefficiency of conventional semantic filtering methods in large language models (LLMs), which incur linear complexity, high latency, and substantial token consumption by invoking the LLM separately for each tuple. To overcome this limitation, the authors propose the Clustering-Sampling-Voting (CSV) framework, the first approach to break the linear invocation barrier. CSV leverages semantic embeddings for clustering, evaluates a small sampled subset, and aggregates results via novel voting strategies, UniVote and SimVote, thereby reducing LLM invocation complexity to sublinear. A dynamic reclustering mechanism further enhances robustness across diverse datasets. Experimental results on real-world data demonstrate that CSV reduces LLM calls by 1.28× to 355× compared to state-of-the-art methods while maintaining comparable accuracy and F1 scores.

๐Ÿ“ Abstract
Large language models (LLMs) are increasingly used for semantic query processing over large corpora. A set of semantic operators derived from relational algebra has been proposed to provide a unified interface for expressing such queries, among which the semantic filter operator serves as a cornerstone. Given a table T and a natural-language predicate e, a semantic filter is executed tuple by tuple: for each tuple, an input prompt combining e with the tuple's content is constructed, the LLM is queried, and a binary decision is obtained. However, this tuple-by-tuple evaluation necessitates a complete linear scan of the table, incurring prohibitive latency and token costs. Although recent work has attempted to optimize semantic filtering, it still does not break the linear LLM invocation barrier. To address this, we propose Clustering-Sampling-Voting (CSV), a new framework that reduces LLM invocations to sublinear complexity while providing error guarantees. CSV embeds tuples into semantic clusters, samples a small subset for LLM evaluation, and infers cluster-level labels via two proposed voting strategies: UniVote, which aggregates labels uniformly, and SimVote, which weights votes by semantic similarity. Moreover, CSV triggers re-clustering on ambiguous clusters to ensure robustness across diverse datasets. Experiments conducted on real-world datasets demonstrate that CSV reduces the number of LLM calls by 1.28× to 355× compared to state-of-the-art approaches, while maintaining comparable effectiveness in terms of Accuracy and F1 score.
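To make the two voting strategies concrete, here is a minimal sketch of how cluster-level label inference might look. This is an illustrative reconstruction, not the paper's implementation: the function names, the 0.5 decision threshold, and the use of cosine similarity as the SimVote weight are assumptions based only on the abstract's description (UniVote aggregates sampled labels uniformly; SimVote weights them by semantic similarity to the tuple being labeled).

```python
import numpy as np

def uni_vote(sample_labels):
    """UniVote (sketch): majority vote over the LLM labels of the
    sampled tuples; the winning label is assigned to the whole cluster.
    Ties break toward the positive label here (an assumption)."""
    return int(2 * sum(sample_labels) >= len(sample_labels))

def sim_vote(tuple_emb, sample_embs, sample_labels):
    """SimVote (sketch): label one unsampled tuple by weighting each
    sampled tuple's LLM label with its cosine similarity to the tuple.
    tuple_emb: (d,) embedding; sample_embs: (k, d); sample_labels: k 0/1 labels."""
    sims = sample_embs @ tuple_emb / (
        np.linalg.norm(sample_embs, axis=1) * np.linalg.norm(tuple_emb) + 1e-9
    )
    score = float(sims @ np.asarray(sample_labels, dtype=float)) / (sims.sum() + 1e-9)
    return int(score >= 0.5)  # threshold is an assumption

# Toy usage: two sampled tuples with orthogonal embeddings and opposite labels.
samples = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = [1, 0]
print(uni_vote(labels))                          # tie -> 1 under this tie rule
print(sim_vote(np.array([1.0, 0.0]), samples, labels))  # closest to the positive sample -> 1
print(sim_vote(np.array([0.0, 1.0]), samples, labels))  # closest to the negative sample -> 0
```

In this sketch UniVote assigns one label per cluster from the sample alone, while SimVote produces per-tuple labels, which is why it can be more robust in mixed clusters; the paper's re-clustering step would additionally split clusters whose votes remain ambiguous.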
Problem

Research questions and friction points this paper is trying to address.

semantic filtering
large language models
linear LLM invocation
query processing
efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

sublinear LLM invocation
semantic filtering
Clustering-Sampling-Voting
voting strategy
efficiency optimization