LOCUS: Low-Dimensional Model Embeddings for Efficient Model Exploration, Comparison, and Selection

📅 2026-01-28
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of efficiently managing and selecting models in the large language model (LLM) ecosystem by proposing LOCUS, a method that uses attention mechanisms to deterministically map query encodings and evaluation scores into low-dimensional model embeddings with well-defined geometric structure. LOCUS enables dynamic updates of model representations without retraining and supports downstream tasks including model similarity measurement, clustering, ensemble selection, and robust proxying for unavailable models. Experimental results demonstrate that LOCUS achieves state-of-the-art routing accuracy on unseen queries while requiring up to 4.8× fewer evaluation samples than baseline methods to produce high-quality embeddings, effectively enabling diverse downstream applications.

📝 Abstract
The rapidly growing ecosystem of Large Language Models (LLMs) makes it increasingly challenging to manage and utilize the vast and dynamic pool of models effectively. We propose LOCUS, a method that produces low-dimensional vector embeddings that compactly represent a language model's capabilities across queries. LOCUS is an attention-based approach that generates embeddings by a deterministic forward pass over query encodings and evaluation scores via an encoder model, enabling seamless incorporation of new models to the pool and refinement of existing model embeddings without having to perform any retraining. We additionally train a correctness predictor that uses model embeddings and query encodings to achieve state-of-the-art routing accuracy on unseen queries. Experiments show that LOCUS needs up to 4.8x fewer query evaluation samples than baselines to produce informative and robust embeddings. Moreover, the learned embedding space is geometrically meaningful: proximity reflects model similarity, enabling a range of downstream applications including model comparison and clustering, model portfolio selection, and resilient proxies of unavailable models.
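The abstract describes embeddings produced "by a deterministic forward pass over query encodings and evaluation scores via an encoder model." The paper's actual architecture is not given here, so the following is only a minimal sketch of that idea: a single frozen attention pass that pools (query encoding, score) pairs into a fixed-size model embedding. All names, dimensions, and the score-conditioning scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (NOT the authors' code): pool (query encoding, score)
# pairs into a low-dimensional model embedding with one deterministic
# attention pass. Dimensions and projections are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_q, d_emb, n_eval = 32, 8, 100  # query dim, embedding dim, #evaluations

# Frozen "encoder" parameters: trained once, then reused for every model,
# so embedding a new model is a forward pass -- no retraining needed.
W_k = rng.standard_normal((d_q, d_emb)) * 0.1       # key projection
W_v = rng.standard_normal((d_q + 1, d_emb)) * 0.1   # value projection (score appended)
probe = rng.standard_normal(d_emb)                  # learned attention probe

def embed_model(query_enc: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """One deterministic attention pass: query encodings + scores -> embedding."""
    keys = query_enc @ W_k                            # (n_eval, d_emb)
    vals = np.hstack([query_enc, scores[:, None]]) @ W_v
    logits = keys @ probe / np.sqrt(d_emb)            # (n_eval,)
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()                                # softmax over evaluations
    return attn @ vals                                # (d_emb,)

queries = rng.standard_normal((n_eval, d_q))          # pre-computed query encodings
scores = rng.integers(0, 2, n_eval).astype(float)     # 0/1 correctness per query
emb = embed_model(queries, scores)
print(emb.shape)  # (8,)
```

Because the aggregation is a pure forward pass, refining a model's embedding after new evaluations means re-running `embed_model` on the enlarged set, which matches the abstract's claim of incorporating new models and refining existing embeddings without retraining.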
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Model Selection
Model Comparison
Efficient Model Exploration
Model Management
Innovation

Methods, ideas, or system contributions that make the work stand out.

low-dimensional embeddings
model routing
attention-based encoding
model comparison
efficient model selection
Shivam Patel
Carnegie Mellon University
William Cocke
Carnegie Mellon University
Gauri Joshi
Carnegie Mellon University
applied probability, machine learning, optimization, information theory