AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models

📅 2025-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Small language models (SLMs) gain little from skill-based in-context learning (ICL), and skill-based demonstrations can even hurt them on easy questions by adding unnecessary information, an effect akin to cognitive overload. Drawing on cognitive load theory from human pedagogy, the paper proposes AdaptMI, an adaptive approach that introduces skill-based in-context examples only when the model performs poorly. Its extension, AdaptMI+, additionally adds examples targeted at the specific skills missing from the model's responses. In 5-shot evaluations across popular math benchmarks and five SLMs (1B–7B parameters) from the Qwen and Llama families, AdaptMI+ improves accuracy by up to 6% over naive skill-based strategies. Because the method operates purely in context, it requires no architectural modification or parameter tuning.

📝 Abstract

In-context learning (ICL) allows a language model to improve its problem-solving capability when provided with suitable information in context. Since the choice of in-context information can be determined based on the problem itself, in-context learning is analogous to human learning from teachers in a classroom. Recent works (Didolkar et al., 2024a; 2024b) show that ICL performance can be improved by leveraging a frontier large language model's (LLM) ability to predict required skills to solve a problem, popularly referred to as an LLM's metacognition, and using the recommended skills to construct necessary in-context examples. While this skill-based strategy boosts ICL performance in larger models, its gains on small language models (SLMs) have been minimal, highlighting a performance gap in ICL capabilities. We investigate this gap and show that skill-based prompting can hurt SLM performance on easy questions by introducing unnecessary information, akin to cognitive overload. To address this, we introduce AdaptMI, an adaptive approach to selecting skill-based in-context Math Instructions for SLMs. Inspired by cognitive load theory from human pedagogy, our method only introduces skill-based examples when the model performs poorly. We further propose AdaptMI+, which adds examples targeted to the specific skills missing from the model's responses. On 5-shot evaluations across popular math benchmarks and five SLMs (1B–7B; Qwen, Llama), AdaptMI+ improves accuracy by up to 6% over naive skill-based strategies.
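The adaptive idea in the abstract (use skill-based examples only when the model performs poorly) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `select_in_context_examples`, the `first_attempt_score` callback, the `threshold` value, and the shape of `skill_bank` are all hypothetical, and the abstract does not say how poor performance is detected, so that part is left to the caller.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical example record, e.g. {"question": "...", "solution": "..."}.
Example = Dict[str, str]


def select_in_context_examples(
    question: str,
    required_skills: Sequence[str],
    fixed_examples: List[Example],
    skill_bank: Dict[str, List[Example]],
    first_attempt_score: Callable[[str], float],
    threshold: float = 0.5,
    k: int = 5,
) -> List[Example]:
    """Pick k in-context examples, using skill-based ones only for hard questions.

    `first_attempt_score` is an assumed, caller-supplied judge of how well the
    model handles `question` without skill-based help (higher is better).
    """
    # Easy question: keep the plain fixed examples and add no extra skill material.
    if first_attempt_score(question) >= threshold:
        return fixed_examples[:k]

    # Hard question: take one example per required skill, in the given order.
    chosen: List[Example] = []
    for skill in required_skills:
        if len(chosen) == k:
            break
        for example in skill_bank.get(skill, []):
            if example not in chosen:
                chosen.append(example)
                break

    # Pad with fixed examples if the skills did not fill all k slots.
    for example in fixed_examples:
        if len(chosen) >= k:
            break
        if example not in chosen:
            chosen.append(example)
    return chosen
```

In this sketch the prompt for an easy question is identical to the non-skill-based baseline, which is one simple way to avoid adding unnecessary information to questions the model already handles.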

Problem

Research questions and friction points this paper is trying to address.

Improving small language models' math problem-solving via adaptive skill-based examples

Addressing cognitive overload in skill-based prompting for small language models

Enhancing accuracy of small models on math benchmarks with targeted skill examples

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive skill-based in-context learning for SLMs

Introduces skill-based examples only when the model performs poorly

Adds examples targeted at the specific skills missing from the model's responses (AdaptMI+)
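The missing-skill targeting described for AdaptMI+ can be sketched in Python as a set difference followed by an example lookup. The names `find_missing_skills` and `targeted_examples` are hypothetical, and the sketch assumes the lists of required and demonstrated skills are already available; how they are obtained is not specified here.

```python
from typing import Dict, List, Sequence


def find_missing_skills(required: Sequence[str], demonstrated: Sequence[str]) -> List[str]:
    """Return required skills absent from the model's response, in order, without duplicates."""
    shown = set(demonstrated)
    missing: List[str] = []
    for skill in required:
        if skill not in shown and skill not in missing:
            missing.append(skill)
    return missing


def targeted_examples(
    required: Sequence[str],
    demonstrated: Sequence[str],
    skill_bank: Dict[str, List[str]],
    k: int = 5,
) -> List[str]:
    """Collect at most k examples, filling slots skill by skill for the missing skills."""
    picked: List[str] = []
    for skill in find_missing_skills(required, demonstrated):
        for example in skill_bank.get(skill, []):
            if len(picked) == k:
                return picked
            if example not in picked:
                picked.append(example)
    return picked
```

If the response already demonstrates every required skill, `targeted_examples` returns an empty list, so no additional examples are introduced.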
