🤖 AI Summary
To address the problem of efficient few-shot example selection in large language model (LLM) in-context learning, this paper proposes CASE, a framework that formulates subset selection as a top-m identification task under a stochastic linear multi-armed bandit setting. Methodologically, CASE introduces a context-aware active evaluation mechanism and a parameterized linear scoring function; it further employs a dynamically maintained shortlist of “challenger” candidate arms to enable selective exploration. Crucially, CASE is the first approach to decouple sample complexity from context optimization—enabling principled, query-efficient selection without requiring full-context re-evaluation. Experiments across multiple NLP benchmarks demonstrate that CASE matches state-of-the-art accuracy while reducing LLM API calls by 87% and accelerating inference by 7×, thereby substantially lowering computational overhead.
📝 Abstract
The in-context learning paradigm with LLMs has been instrumental in advancing a wide range of natural language processing tasks. The selection of few-shot examples (exemplars / demonstration samples) is essential for constructing effective prompts under context-length budget constraints. In this paper, we formulate the exemplar selection task as a top-m best arm identification problem. A key challenge in this setup is the exponentially large number of arms that must be evaluated to identify the m best arms. We propose CASE (Challenger Arm Sampling for Exemplar selection), a novel sample-efficient selective exploration strategy that maintains a shortlist of "challenger" arms, which are current candidates for the top-m arms. In each iteration, only one arm from this shortlist or the current top-m set is pulled, thereby reducing sample complexity and, consequently, the number of LLM evaluations. Furthermore, we model the scores of exemplar subsets (arms) using a parameterized linear scoring function, leading to a stochastic linear bandit setting. CASE achieves remarkable efficiency gains of up to a 7x speedup in runtime while requiring 7x fewer LLM calls (an 87% reduction) without sacrificing performance compared to state-of-the-art exemplar selection methods. We release our code and data at https://github.com/kiranpurohit/CASE
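The challenger-arm idea described above can be sketched in a few lines. The following is an illustrative toy implementation, not the paper's actual algorithm: arms are feature vectors (e.g., embeddings of exemplar subsets), rewards come from a caller-supplied `pull_fn` standing in for one LLM evaluation, and each iteration pulls only the single most uncertain arm among the strongest challenger and the weakest current top-m member. All names (`case_sketch`, `pull_fn`, `beta`) and the confidence-bound details are assumptions for exposition.

```python
import numpy as np

def case_sketch(arm_features, pull_fn, m, n_iters=400, reg=1.0, beta=2.0):
    """Toy top-m identification in a stochastic linear bandit (illustrative only).

    arm_features: (K, d) array, one feature vector per candidate subset (arm).
    pull_fn(i):   noisy scalar reward for arm i (stands in for one LLM call).
    """
    K, d = arm_features.shape
    A = reg * np.eye(d)   # regularized design matrix
    b = np.zeros(d)
    for _ in range(n_iters):
        A_inv = np.linalg.inv(A)
        theta = A_inv @ b                      # least-squares score estimate
        means = arm_features @ theta
        # per-arm confidence width: beta * sqrt(x^T A^{-1} x)
        widths = beta * np.sqrt(
            np.einsum("ij,jk,ik->i", arm_features, A_inv, arm_features)
        )
        top_m = np.argsort(means)[::-1][:m]    # current empirical top-m set
        rest = np.setdiff1d(np.arange(K), top_m)
        # challenger: non-top arm with the highest optimistic score
        challenger = rest[np.argmax(means[rest] + widths[rest])]
        # weakest top-m member by pessimistic score
        weakest = top_m[np.argmin(means[top_m] - widths[top_m])]
        # selective exploration: pull only the more uncertain of the two
        i = challenger if widths[challenger] >= widths[weakest] else weakest
        x = arm_features[i]
        A += np.outer(x, x)
        b += pull_fn(i) * x
    theta = np.linalg.solve(A, b)
    return set(np.argsort(arm_features @ theta)[::-1][:m])
```

Because only the boundary between the top-m set and its challengers is sampled, evaluation effort concentrates where the ranking is still undecided; arms that are clearly in or out stop consuming pulls, which is the source of the query savings the abstract reports.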