🤖 AI Summary
Large language models (LLMs) exhibit systematic deficits in compositional generalization for abstract reasoning tasks—particularly algebraic evaluation under out-of-distribution operator precedence rules (e.g., addition before multiplication). To address this, we propose iterative in-context learning: dynamically selecting and refining a small set of demonstration examples, augmented with explicit step-by-step reasoning instructions and iterative prompting. Crucially, we find that using *simpler* (rather than distribution-matched) examples—those with fewer operators or shallower nesting than test instances—significantly improves zero-shot generalization, challenging conventional wisdom on example selection. Empirical evaluation across multiple nonstandard precedence benchmarks demonstrates substantial gains over strong baselines: up to +27.4% absolute accuracy improvement in addition-first arithmetic. Our approach offers a novel pathway toward enhancing LLMs’ structured reasoning and compositional generalization capabilities without architectural modification or fine-tuning.
📝 Abstract
LLMs face significant challenges in systematic generalization, particularly on reasoning tasks that require compositional rules and the handling of out-of-distribution examples. To address these challenges, we introduce an in-context learning methodology that improves the generalization capabilities of general-purpose LLMs. Our approach employs an iterative example selection strategy that incrementally constructs a tailored set of few-shot examples optimized to enhance the model's performance on a given task. As a proof of concept, we apply this methodology to the evaluation of algebraic expressions under non-standard simplification rules, in which the precedence of addition and multiplication is swapped.
Our findings indicate that LLMs exhibit limited proficiency in these mathematical tasks. We further demonstrate that LLM reasoning benefits from our iterative shot-selection prompting strategy when integrated with explicit reasoning instructions. Crucially, our experiments reveal that some LLMs generalize better when prompted with simpler few-shot examples than with complex ones drawn from the test data distribution.
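To make the non-standard precedence rule concrete, here is a minimal sketch (not the authors' evaluation harness) of an addition-first evaluator, assuming parenthesis-free expressions over non-negative integers with only `+` and `*`:

```python
def eval_addition_first(expr: str) -> int:
    """Evaluate an expression where '+' binds tighter than '*'.

    Under this rule, "2+3*4" parses as (2+3)*4 = 20, whereas
    standard precedence would give 2+(3*4) = 14.
    """
    # Because '*' is now the lowest-precedence operator, split on it
    # first at the top level, then evaluate each '+'-chain as a sum.
    product = 1
    for factor in expr.replace(" ", "").split("*"):
        product *= sum(int(term) for term in factor.split("+"))
    return product
```

For example, `eval_addition_first("2+3*4")` returns 20, the kind of out-of-distribution target the benchmarks test against.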