SecureRouter: Encrypted Routing for Efficient Secure Inference

📅 2026-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the inefficiency and accuracy trade-off in existing encrypted neural network inference systems, which rely on fixed models for all inputs, resulting in high latency and cost. The paper proposes the first end-to-end encrypted dynamic routing framework that adaptively selects the optimal model size for Transformer inference directly on ciphertext, based on input features. By integrating secure multi-party computation (MPC), an MPC-overhead-aware encrypted router, a co-optimized model pool, and quantization strategies, the framework unifies secure routing, inference, and protocol execution while preserving the confidentiality of both data and models. Experimental results demonstrate that the approach reduces inference latency by up to 1.95× compared to state-of-the-art methods with negligible accuracy loss, offering a practical solution for scalable and secure AI inference.
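To make the MPC building block concrete, here is a minimal sketch of additive secret sharing, one common way an input can be split so that no single party sees it. This is an illustration under assumptions, not the paper's protocol: the summary does not say which MPC scheme SecureRouter uses, and the 64-bit ring, the two-party default, and the names `share`, `reconstruct`, and `add_shared` are all hypothetical.

```python
import secrets

# Assumption: arithmetic over the ring of integers modulo 2^64.
RING_BITS = 64
MODULUS = 1 << RING_BITS


def share(value, num_parties=2):
    """Split an integer into random additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Add two shared values; each party adds its own shares locally."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


if __name__ == "__main__":
    x_shares = share(5)
    y_shares = share(7)
    print(reconstruct(add_shared(x_shares, y_shares)))
```

Addition needs no communication between parties in this scheme, while multiplications and nonlinear functions generally require interaction, which is one reason model size has such a strong effect on MPC latency.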

📝 Abstract

Cryptographically secure neural network inference typically relies on secure computing techniques such as Secure Multi-Party Computation (MPC), enabling cloud servers to process client inputs without decrypting them. Although prior privacy-preserving inference systems co-design network optimizations with MPC, they remain slow and costly, limiting real-world deployment. A major bottleneck is their use of a single, fixed transformer model for all encrypted inputs, ignoring that different inputs require different model sizes to balance efficiency and accuracy. We present SecureRouter, an end-to-end encrypted routing and inference framework that accelerates secure transformer inference through input-adaptive model selection under encryption. SecureRouter establishes a unified encrypted pipeline that integrates a secure router with an MPC-optimized model pool, enabling coordinated routing, inference, and protocol execution while preserving full data and model confidentiality. The framework includes training-phase and inference-phase components: an MPC-cost-aware secure router that predicts per-model utility and cost from encrypted features, and an MPC-optimized model pool whose architectures and quantization schemes are co-trained to minimize MPC communication and computation overhead. Compared to prior work, SecureRouter reduces latency by 1.95x with negligible accuracy loss, offering a practical path toward scalable and efficient secure AI inference. Our open-source implementation is available at: https://github.com/UCF-ML-Research/SecureRouter
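The routing idea in the abstract can be sketched in plaintext as follows. This is an illustrative sketch only: the abstract says the router predicts per-model utility and cost, but it does not say how the two are combined, so the linear trade-off `utility - cost_weight * cost` is an assumption, as are the names `Candidate`, `route`, and `cost_weight`. In the actual system the selection would run under MPC on encrypted features, not on plaintext floats.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One model in the pool, with the router's predictions for a given input."""
    name: str
    predicted_utility: float   # hypothetical: estimated answer quality, higher is better
    predicted_mpc_cost: float  # hypothetical: normalized MPC latency/communication cost


def route(candidates, cost_weight=0.5):
    """Pick the model with the best utility-minus-weighted-cost score.

    Assumption: a linear trade-off; the paper's actual objective may differ.
    """
    if not candidates:
        raise ValueError("model pool is empty")
    return max(
        candidates,
        key=lambda c: c.predicted_utility - cost_weight * c.predicted_mpc_cost,
    )


if __name__ == "__main__":
    pool = [
        Candidate("small", predicted_utility=0.80, predicted_mpc_cost=0.1),
        Candidate("large", predicted_utility=0.90, predicted_mpc_cost=1.0),
    ]
    # With a cost penalty the small model wins; with none, the large one does.
    print(route(pool, cost_weight=0.5).name)
    print(route(pool, cost_weight=0.0).name)
```

Raising `cost_weight` pushes more inputs toward smaller models, which is the lever that trades accuracy for latency in this simplified view.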

Problem

Research questions and friction points this paper is trying to address.

Secure Inference

Secure Multi-Party Computation

Transformer Models

Model Selection

Encrypted Routing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Secure Multi-Party Computation

Input-Adaptive Inference

Encrypted Routing