🤖 AI Summary
This work addresses the problem of selecting high-quality in-context examples under a limited prompt budget, where the choice of demonstrations affects accuracy but selection must stay computationally cheap. We propose Meta-Sel, a lightweight supervised meta-learning approach to demonstration selection for intent classification. Meta-Sel trains an interpretable scoring function, a calibrated logistic regressor over two features (TF-IDF cosine similarity and a length-compatibility ratio), on a meta-dataset of (candidate, query) pairs labeled by class agreement. This enables fast, deterministic, and auditable example ranking without fine-tuning or additional LLM calls. In a benchmark of 12 selection methods across four intent classification datasets and five open-source large language models, Meta-Sel consistently ranks among the top-performing methods, with particularly notable gains for smaller models, while keeping selection-time overhead competitive.
📝 Abstract
Demonstration selection is a practical bottleneck in in-context learning (ICL): under a tight prompt budget, accuracy can change substantially depending on which few-shot examples are included, yet selection must remain cheap enough to run per query over large candidate pools. We propose Meta-Sel, a lightweight supervised meta-learning approach for intent classification that learns a fast, interpretable scoring function for (candidate, query) pairs from labeled training data. Meta-Sel constructs a meta-dataset by sampling pairs from the training split and using class agreement as supervision, then trains a calibrated logistic regressor on two inexpensive meta-features: TF-IDF cosine similarity and a length-compatibility ratio. At inference time, the selector performs a single vectorized scoring pass over the full candidate pool and returns the top-k demonstrations, requiring no model fine-tuning, no online exploration, and no additional LLM calls. This yields deterministic rankings and makes the selection mechanism straightforward to audit via interpretable feature weights. Beyond proposing Meta-Sel, we provide a broad empirical study of demonstration selection, benchmarking 12 methods (spanning prompt engineering baselines, heuristic selection, reinforcement learning, and influence-based approaches) across four intent datasets and five open-source LLMs. Across this benchmark, Meta-Sel consistently ranks among the top-performing methods, is particularly effective for smaller models where selection quality can partially compensate for limited model capacity, and maintains competitive selection-time overhead.
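
The pipeline described in the abstract (sample pairs, label them by class agreement, fit a logistic model on two meta-features, then score the pool and keep the top-k) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The whitespace tokenizer, the smoothed IDF formula, the min/max definition of the length ratio, the plain gradient-descent fit (with no calibration step), the toy banking utterances, and all function names are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def tokenize(text):
    return text.lower().split()


def fit_idf(texts):
    """Smoothed inverse document frequency (assumed formula)."""
    n = len(texts)
    df = Counter(tok for t in texts for tok in set(tokenize(t)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(text, idf):
    """L2-normalised sparse TF-IDF vector; out-of-vocabulary tokens are dropped."""
    tf = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}


def meta_features(cand, query, idf):
    """[TF-IDF cosine similarity, length ratio] for one (candidate, query) pair."""
    a, b = tfidf(cand, idf), tfidf(query, idf)
    cosine = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    la, lb = len(tokenize(cand)), len(tokenize(query))
    return [cosine, min(la, lb) / max(la, lb, 1)]  # ratio definition is assumed


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_selector(train, n_pairs=500, epochs=200, lr=0.5, seed=0):
    """Build the meta-dataset and fit an (uncalibrated) logistic regressor."""
    rng = random.Random(seed)
    idf = fit_idf([text for text, _ in train])
    X, y = [], []
    for _ in range(n_pairs):
        (c_text, c_label), (q_text, q_label) = rng.sample(train, 2)
        X.append(meta_features(c_text, q_text, idf))
        y.append(1.0 if c_label == q_label else 0.0)  # class agreement as supervision
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):  # batch gradient descent on the log loss
        gw, gb = [0.0, 0.0], 0.0
        for x, t in zip(X, y):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - t
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / len(X), w[1] - lr * gw[1] / len(X)]
        b -= lr * gb / len(X)
    return idf, w, b


def select_top_k(query, pool, idf, w, b, k=2):
    """Score every candidate against the query and return the k best."""
    scored = []
    for text, label in pool:
        x = meta_features(text, query, idf)
        scored.append((sigmoid(w[0] * x[0] + w[1] * x[1] + b), text, label))
    scored.sort(key=lambda s: -s[0])  # stable sort keeps ties in pool order
    return scored[:k]


# Toy labeled training split (hypothetical utterances).
train = [
    ("what is my account balance", "balance"),
    ("show my current account balance", "balance"),
    ("check the balance on my savings account", "balance"),
    ("transfer money to my savings", "transfer"),
    ("send money to another account", "transfer"),
    ("transfer funds to a friend", "transfer"),
    ("i lost my credit card", "card"),
    ("my card was stolen yesterday", "card"),
    ("block my lost card", "card"),
]
idf, w, b = train_selector(train)
top = select_top_k("please show my balance", train, idf, w, b, k=2)
```

In this sketch the learned parameters `w` and `b` are just three numbers, which is what makes such a selector easy to inspect. With a fixed seed the pair sampling, the fit, and the resulting ranking are all deterministic, and selection needs no LLM call.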