🤖 AI Summary
This work addresses the challenge of efficiently implementing the ORDER BY operator with large language models (LLMs). We propose the first LLM-specific sorting logic abstraction together with a unified evaluation framework. Methodologically, we introduce an agreement-based batch-size policy for value-based sorting, a majority-voting mechanism for pairwise comparisons, and a two-way external merge sort tailored to LLM inference characteristics, all evaluated on models including GPT-4o. Experiments across multiple datasets and models demonstrate significant improvements in both sorting accuracy and efficiency. Moreover, we uncover, for the first time, a log-linear trade-off between computational cost and sorting quality, enabling a principled cost–quality balance. Our framework establishes foundational principles for scalable, accurate, and efficient sorting in LLM-driven database systems.
📝 Abstract
We present the LLM ORDER BY operator as a logical abstraction and study its physical implementations within a unified evaluation framework. Our experiments show that no single approach is universally optimal; effectiveness depends on query characteristics and data. We introduce three new designs: an agreement-based batch-size policy, a majority voting mechanism for pairwise sorting, and a two-way external merge sort adapted for LLMs. Extensive experiments show that the agreement-based procedure is effective at determining batch size for value-based methods, that majority voting consistently strengthens pairwise comparisons on GPT-4o, and that external merge sort achieves strong accuracy–efficiency trade-offs across datasets and models. We further observe log-linear scaling between compute cost and ordering quality, a first step toward principled cost models for LLM-powered data systems.
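The two pairwise designs above can be sketched together. The following is a minimal illustration, not the paper's implementation: `llm_compare` is a hypothetical stand-in for a single LLM comparison call, simulated here by a noisy oracle, and the merge sort is shown in-memory rather than external for brevity.

```python
import random
from functools import cmp_to_key

def llm_compare(a, b, flip_prob=0.2):
    """Hypothetical single LLM judgment: -1 if a should precede b, else 1.
    Simulated by a ground-truth comparison flipped with probability flip_prob,
    mimicking an unreliable model response."""
    correct = -1 if a < b else 1
    return correct if random.random() >= flip_prob else -correct

def majority_compare(a, b, votes=5, flip_prob=0.2):
    """Majority-voting pairwise comparison: issue an odd number of
    independent judgments and return the majority direction."""
    tally = sum(llm_compare(a, b, flip_prob) for _ in range(votes))
    return -1 if tally < 0 else 1

def merge_sort(items, cmp):
    """Two-way merge sort driven by an arbitrary comparator; each
    merge step consumes one pairwise comparison at a time."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], cmp)
    right = merge_sort(items[mid:], cmp)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if cmp(left[i], right[j]) <= 0:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

if __name__ == "__main__":
    random.seed(0)
    items = [3, 1, 4, 1, 5, 9, 2, 6]
    print(merge_sort(items, majority_compare))
```

With per-call error rate 0.2 and 5 votes, a majority verdict is wrong only when at least 3 of 5 calls err, so voting sharply reduces the chance that any single comparison misorders a pair; the external variant would stream sorted runs from disk and merge them two at a time with the same comparator.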