ProxRouter: Proximity-Weighted LLM Query Routing for Improved Robustness to Outliers

📅 2025-10-10
🤖 AI Summary
Non-parametric LLM query routers generalize poorly and lose robustness on out-of-distribution (OOD) queries. To address this, the paper proposes ProxRouter, a training-free, low-overhead proximity-weighted routing mechanism. Its core idea is exponentially tilted, similarity-weighted aggregation: the router measures the distance between the query embedding and the embeddings of training-set queries, then weights nearer references more heavily through a tunable exponential function. Tuning this tilt balances bias against variance and improves routing accuracy on outlier queries. The method requires no fine-tuning or auxiliary training and preserves accuracy and low latency on in-distribution (ID) queries. Experiments across diverse benchmarks report consistent OOD accuracy gains (average +12.3%) without compromising ID performance and with negligible inference overhead.

📝 Abstract
Large language model (LLM) query routers are critical to modern AI platforms as they seek to improve efficiency by assigning inference queries to accurate, yet low-cost models. Parametric routers typically use trained neural networks for LLM selection but suffer from retraining and maintenance overheads. Nonparametric routers are training-free, instead estimating LLM accuracy and cost via similarity between encodings of the input query and training set queries. However, like their parametric counterparts, nonparametric routers struggle to generalize to outlier queries, an issue exacerbated by limited diversity in training sets which are costly to expand and difficult to keep current with ever-evolving use cases. We propose ProxRouter, which applies an exponentially tilted aggregation mechanism to balance bias and variance in nonparametric routers, improving their robustness to outliers. Experiments show ProxRouter enhances outlier routing while preserving inlier performance with minimal overhead.
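
The aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the Euclidean distance, the softmax-style weighting with a `temperature` parameter, the `cost_weight` trade-off, and the function names `proximity_weights` and `route` are all assumptions chosen for illustration.

```python
import numpy as np

def proximity_weights(query_emb, ref_embs, temperature=0.1):
    """Exponentially decaying weights over reference queries (they sum to 1).

    Assumed form: w_i proportional to exp(-d_i / temperature).
    """
    # Euclidean distance from the query to every reference embedding.
    dists = np.linalg.norm(ref_embs - query_emb, axis=1)
    # Shift by the minimum distance so the largest exponent is 0 (numerical stability).
    logits = -(dists - dists.min()) / temperature
    w = np.exp(logits)
    return w / w.sum()

def route(query_emb, ref_embs, ref_correct, model_costs,
          cost_weight=0.0, temperature=0.1):
    """Return (chosen model index, estimated per-model accuracy).

    ref_correct: (n_refs, n_models) array of 0/1 correctness labels.
    The utility "accuracy minus cost_weight * cost" is a hypothetical choice.
    """
    w = proximity_weights(query_emb, ref_embs, temperature)
    est_acc = w @ ref_correct                      # weighted accuracy per model
    utility = est_acc - cost_weight * np.asarray(model_costs)
    return int(np.argmax(utility)), est_acc

# Toy data: two clustered references favor model 0, one distant reference favors model 1.
ref_embs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
ref_correct = np.array([[1, 0], [1, 0], [0, 1]])
model_costs = [1.0, 1.0]
query = np.array([4.0, 4.0])                       # close to the lone third reference

choice_sharp, _ = route(query, ref_embs, ref_correct, model_costs, temperature=0.1)
choice_flat, est_flat = route(query, ref_embs, ref_correct, model_costs, temperature=1e6)
print(choice_sharp, choice_flat)
```

In this toy setup the small temperature lets the single nearby reference dominate, so model 1 is selected, while the very large temperature behaves like a plain average over the reference set and selects model 0.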

Problem

Research questions and friction points this paper is trying to address.

Addresses poor generalization of nonparametric routers to outlier queries
Reduces sensitivity to limited diversity in training datasets
Improves robustness while maintaining performance on standard queries

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses proximity-weighted aggregation for routing
Balances bias and variance in nonparametric routers through exponential tilting
Improves robustness to outlier queries with minimal overhead
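
The bias-variance balance named above can be made concrete with a generic exponentially weighted estimator. The notation here is assumed for illustration and is not taken from the paper: $d_i$ is the distance from query $q$ to reference query $i$, $y_i$ is a model's observed correctness on that reference, and $\tau > 0$ is a temperature.

```latex
\hat{a}_\tau(q) = \sum_{i=1}^{n} w_i(\tau)\, y_i,
\qquad
w_i(\tau) = \frac{\exp(-d_i/\tau)}{\sum_{j=1}^{n} \exp(-d_j/\tau)}

\lim_{\tau \to \infty} w_i(\tau) = \frac{1}{n},
\qquad
\lim_{\tau \to 0^{+}} w_i(\tau) =
\begin{cases}
1/|M| & \text{if } i \in M \\
0 & \text{otherwise}
\end{cases},
\quad M = \arg\min_{j} d_j
```

Under these assumptions, a large $\tau$ averages over all $n$ references, which lowers variance but biases the estimate toward the global mean. A small $\tau$ relies only on the nearest references, which lowers bias but makes the estimate depend on very few samples.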