A Survey to Recent Progress Towards Understanding In-Context Learning

📅 2024-02-03
📈 Citations: 8
Influential: 0

🤖 AI Summary
This paper addresses a fundamental question in in-context learning (ICL): how large language models generalize without gradient updates, relying solely on the task demonstrations in a prompt. Critiquing the prevailing conflation of *skill recognition* and *skill learning*, as well as fragmented analytical frameworks, we define both notions and propose a unified analysis framework grounded in data generation. The framework reveals their shared reliance on implicit contextual modeling while distinguishing recognition (pattern matching over the context) from learning (distributional adaptation across tasks). Through conceptual abstraction, a systematic cross-method review, and a reconstruction of the generative modeling paradigm, we address key interpretability ambiguities in ICL, clarify the pathways to generalization and novel skill acquisition, and lay a theoretical and methodological foundation for more controllable, predictable ICL prompt engineering.

📝 Abstract
In-Context Learning (ICL) empowers Large Language Models (LLMs) to learn from a few examples provided in the prompt, enabling downstream generalization without requiring gradient updates. Despite encouraging empirical success, the underlying mechanism of ICL remains unclear. Existing research offers various, often ambiguous viewpoints and relies on intuition-driven, ad hoc technical solutions to interpret ICL. In this paper, we leverage a data generation perspective to reinterpret recent efforts from a systematic angle, demonstrating the potential for broader use of these popular technical solutions. For a conceptual definition, we rigorously adopt the terms skill recognition and skill learning. Skill recognition selects one learned data generation function previously seen during pre-training, while skill learning learns new data generation functions from in-context data. Furthermore, we provide insights into the strengths and weaknesses of both abilities, emphasizing their commonalities through the perspective of data generation. This analysis suggests potential directions for future research.
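
The distinction between the two abilities can be illustrated with a toy sketch. It is not taken from the paper: the skill library `PRETRAINED_SKILLS`, the helper names `recognize_skill` and `learn_skill`, and the restriction to one-dimensional numeric functions are all assumptions made for illustration. Skill recognition is modeled as picking, from a fixed set of known data generation functions, the one that best explains the demonstrations; skill learning is modeled as fitting a new linear function from the demonstrations alone.

```python
import numpy as np

# Hypothetical stand-in for data generation functions "seen during pre-training".
PRETRAINED_SKILLS = {
    "double": lambda x: 2.0 * x,
    "negate": lambda x: -x,
    "square": lambda x: x ** 2,
}


def recognize_skill(xs, ys, skills=PRETRAINED_SKILLS):
    """Skill recognition (toy): select the known function with the lowest
    mean squared error on the in-context demonstrations."""
    errors = {name: float(np.mean((f(xs) - ys) ** 2)) for name, f in skills.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]


def learn_skill(xs, ys):
    """Skill learning (toy): fit a new function y = w * x + b by least squares,
    using only the in-context demonstrations."""
    design = np.stack([xs, np.ones_like(xs)], axis=1)
    (w, b), *_ = np.linalg.lstsq(design, ys, rcond=None)
    return float(w), float(b)


demo_x = np.array([1.0, 2.0, 3.0, 4.0])
seen_y = 2.0 * demo_x          # task covered by the skill library
novel_y = 3.0 * demo_x + 1.0   # task absent from the skill library

seen_match = recognize_skill(demo_x, seen_y)     # exact match in the library
novel_match = recognize_skill(demo_x, novel_y)   # best available, but imperfect
novel_fit = learn_skill(demo_x, novel_y)         # recovers the new function
```

In this sketch, recognition explains the seen task exactly but can only return the closest library entry for the novel task, whereas the fitted function recovers the novel mapping from the same four demonstrations. Real LLM behavior is far less clean; the example only makes the two definitions concrete.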
Problem

Research questions and friction points this paper is trying to address.

In-Context Learning
Skill Recognition
Learning Ability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Skill Recognition
Skill Learning
Instruction-based In-Context Learning