Privacy-Preserving LLMs Routing

📅 2026-04-17

🤖 AI Summary
This work addresses the privacy risks inherent in large language model (LLM) routing while overcoming the prohibitive computational overhead of existing cryptographic solutions. To this end, we propose PPRoute, a framework that enables efficient and accurate privacy-preserving LLM routing under secure multi-party computation (MPC). The key innovations include an MPC-friendly encoder architecture, a multi-stage training algorithm operating directly in the encrypted domain, and a constant-communication-complexity (O(1)) oblivious Top-k selection mechanism. Experimental results demonstrate that PPRoute achieves accuracy on par with plaintext routing across multiple datasets while offering approximately 20× speedup over a naive MPC implementation.
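To see why an encoder has to be "MPC-friendly", consider a minimal sketch of two-party additive secret sharing. The summary does not say which MPC protocol PPRoute builds on, so the sharing scheme, the ring size `MOD`, and every function name here are illustrative assumptions. The sketch only shows that a linear layer with public integer weights can be evaluated on each share locally, with no communication; non-linear operations do not have this property in such a scheme and are where the cost usually lies.

```python
import secrets

MOD = 2 ** 32  # ring size; an assumption chosen for illustration only


def share(x):
    """Split an integer vector into two additive shares modulo MOD."""
    s0 = [secrets.randbelow(MOD) for _ in x]
    s1 = [(xi - a) % MOD for xi, a in zip(x, s0)]
    return s0, s1


def linear(weights, share_vec):
    """Apply a public integer weight matrix to one party's share, locally."""
    return [sum(w * s for w, s in zip(row, share_vec)) % MOD for row in weights]


def reconstruct(a, b):
    """Add the two shares and map ring elements back to signed integers."""
    total = [(x + y) % MOD for x, y in zip(a, b)]
    return [v - MOD if v >= MOD // 2 else v for v in total]


def private_linear(weights, x):
    """Share x, run the linear layer on each share, then reconstruct."""
    s0, s1 = share(x)
    return reconstruct(linear(weights, s0), linear(weights, s1))
```

Each share on its own is uniformly random, so neither party learns `x` from the share it holds; only the sum of the two outputs reveals the result of the layer.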

📝 Abstract
Large language model (LLM) routing has emerged as a critical strategy for balancing model performance and cost-efficiency by dynamically selecting services from various model providers. However, LLM routing adds an intermediate layer between users and LLMs, which creates new privacy risks for user data, and these risks have not been systematically studied. Although cryptographic techniques such as Secure Multi-Party Computation (MPC) enable privacy-preserving computation, their protocol design and implementation for LLM routing remain under-explored, and naïve implementations typically incur prohibitive computational overhead. To address this, we propose PPRoute, a privacy-preserving LLM routing framework. PPRoute combines several strategies to speed up encoder inference and nearest neighbor search under MPC while maintaining routing quality. First, PPRoute uses MPC-friendly operations to accelerate encoder inference. Second, PPRoute uses a multi-step model training algorithm to maintain routing quality despite the constraints of the encrypted domain. Third, PPRoute proposes an unsorted Top-k algorithm with $O(1)$ communication complexity for secure sorting in model search, significantly reducing communication latency. Across different datasets, PPRoute matches the performance of its plaintext counterparts while achieving an approximately 20$\times$ speedup over naïve MPC implementations.
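The routing step that the abstract describes (encode the query, find its nearest neighbors, select a model) can be sketched in plaintext as follows. This is not PPRoute's protocol: it runs in the clear, and the reference-prompt table, the per-model score matrix, and the function name `route` are assumptions made for illustration. The one point it shares with the abstract is that the Top-k step only needs the set of the k nearest neighbors, not their order, which is what `np.argpartition` returns.

```python
import numpy as np


def route(query_emb, ref_embs, ref_scores, k=3):
    """Choose a model index for a query from its k nearest reference prompts.

    query_emb:  (d,) embedding of the incoming query
    ref_embs:   (n, d) embeddings of reference prompts (hypothetical table)
    ref_scores: (n, m) quality score of each of m models on each reference prompt
    """
    # Cosine similarity between the query and every reference prompt.
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    sims = r @ q

    # Unsorted Top-k: the k most similar prompts, in no particular order.
    topk = np.argpartition(-sims, k - 1)[:k]

    # Route to the model with the best average score on those neighbors.
    best_model = int(np.argmax(ref_scores[topk].mean(axis=0)))
    return best_model, set(topk.tolist())
```

A privacy-preserving version would have to perform the same similarity and selection steps on secret-shared values, which is where the cost of sorting under MPC becomes relevant.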
Problem

Research questions and friction points this paper is trying to address.

Privacy-Preserving
LLM Routing
Secure Multi-Party Computation
User Data Privacy
Computational Overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Privacy-Preserving LLM Routing
Secure Multi-Party Computation (MPC)
MPC-friendly Encoder Inference
Unsorted Top-k Algorithm
Encrypted Domain Training