Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection

📅 2026-01-14

🤖 AI Summary

This work addresses how to train large language model (LLM) routers, which dynamically select the best expert model for each input, when no human-annotated data is available. The authors study routers trained exclusively on queries and answers auto-generated by a generator LLM, and propose CASCAL, a query-only router that requires no manual labels. CASCAL is robust to low-quality synthetic data because it estimates model correctness through consensus voting and identifies each expert's domain of specialization via hierarchical clustering. When trained on data from weak generator models, CASCAL outperforms the best query-answer router by 4.6% absolute accuracy.

📝 Abstract

Large Language Model (LLM) routers dynamically select optimal models for given inputs. Existing approaches typically assume access to ground-truth labeled data, which is often unavailable in practice, especially when user request distributions are heterogeneous and unknown. We introduce Routing with Generated Data (RGD), a challenging setting in which routers are trained exclusively on generated queries and answers produced from high-level task descriptions by generator LLMs. We evaluate query-answer routers (using both queries and labels) and query-only routers across four diverse benchmarks and 12 models, finding that query-answer routers degrade faster than query-only routers as generator quality decreases. Our analysis reveals two crucial characteristics of effective generators: they must accurately respond to their own questions, and their questions must produce sufficient performance differentiation among the model pool. We then show how filtering for these characteristics can improve the quality of generated data. We further propose CASCAL, a novel query-only router that estimates model correctness through consensus voting and identifies model-specific skill niches via hierarchical clustering. CASCAL is substantially more robust to generator quality, outperforming the best query-answer router by 4.6% absolute accuracy when trained on weak generator data.
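
The consensus-voting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain majority voting as the consensus rule, assumes cluster assignments are already available (for example from any hierarchical clustering of query embeddings), and all function names, model names, and data are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_labels(answers):
    """Return the majority answer per query across all models.

    answers: dict mapping model name -> list of answers, one per query.
    Assumption: simple majority vote stands in for the consensus step.
    """
    models = list(answers)
    num_queries = len(answers[models[0]])
    labels = []
    for i in range(num_queries):
        votes = Counter(answers[m][i] for m in models)
        labels.append(votes.most_common(1)[0][0])
    return labels


def skill_by_cluster(answers, cluster_ids):
    """Estimate each model's agreement rate with the consensus, per cluster.

    cluster_ids: list with one (assumed, precomputed) cluster id per query.
    """
    labels = consensus_labels(answers)
    sizes = Counter(cluster_ids)
    hits = defaultdict(lambda: defaultdict(int))
    for i, cluster in enumerate(cluster_ids):
        for model, outputs in answers.items():
            if outputs[i] == labels[i]:
                hits[cluster][model] += 1
    return {
        c: {m: hits[c][m] / sizes[c] for m in answers}
        for c in sizes
    }


def route(skills, cluster_id):
    """Pick the model with the highest estimated skill in the given cluster."""
    return max(skills[cluster_id], key=skills[cluster_id].get)


# Hypothetical toy data: six generated queries, three models, two clusters.
answers = {
    "model_a": ["4", "paris", "red", "x", "y", "v"],
    "model_b": ["4", "paris", "blue", "z", "w", "u"],
    "model_c": ["5", "rome", "red", "z", "w", "v"],
}
cluster_ids = [0, 0, 0, 1, 1, 1]

skills = skill_by_cluster(answers, cluster_ids)
best_for_cluster_0 = route(skills, 0)
best_for_cluster_1 = route(skills, 1)
```

No ground-truth labels appear anywhere in this sketch: correctness is approximated by agreement with the pool's majority answer, and a new query would be routed by first assigning it to a cluster and then looking up the strongest model for that cluster.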

Problem

Research questions and friction points this paper is trying to address.

LLM routing

annotation-free

expert selection

generated data

skill estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Routing with Generated Data

Annotation-Free

LLM Skill Estimation